[ https://issues.apache.org/jira/browse/YARN-5219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15327099#comment-15327099 ]

Sunil G commented on YARN-5219:
-------------------------------

I have done some analysis on this issue. One option was to use *trap* inside the 
{{launch_container.sh}} script to capture *ERR*. However, since we are calling 
*export* on these variables, ERR was not raised when the bad substitution error 
popped up.
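For reference, a rough standalone sketch of that attempt (a test script of my own, not the generated {{launch_container.sh}}; the variable name is just an example):
{noformat}
#!/bin/bash
# Test whether an ERR trap catches the bad substitution coming from an export.
trap 'echo "ERR trap fired" >&2; exit 1' ERR

# Same pattern as the failing line in launch_container.sh.
export USER1="user1${foo.bar}"

# The bad substitution is reported on stderr, but the ERR trap does not fire,
# so execution continues to this line.
echo "still running after the failed export"
{noformat}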
Interestingly, the shell exposes a way to do a null check when a variable is 
substituted. Copying the relevant information from StackExchange for this 
functionality:
{noformat}
${parameter:?[word]}

Indicate Error if Null or Unset. If parameter is unset or null, the expansion 
of word (or a message indicating it is unset if word is omitted) shall be 
written to standard error and the shell exits with a non-zero exit status. 
Otherwise, the value of parameter shall be substituted. An interactive shell 
need not exit.
{noformat}
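
To illustrate the behaviour quickly (a standalone check; {{UNDEFINED_VAR}} is just a made-up name, and the exact error text may vary slightly between shell versions):
{noformat}
$ bash -c ': "${UNDEFINED_VAR:?must be set}"; echo "not reached"'
bash: UNDEFINED_VAR: must be set
{noformat}
The {{echo}} never runs and the shell exits with a non-zero status, which is exactly the behaviour we need for each exported variable.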

I have attached a proposed script which helps validate any variable and stops 
the script execution if a substitution error occurs.
However, this requires adding a few extra pieces of code inside 
*launch_container.sh*:
- A new validator method named {{verify_export_variable}}, which validates each 
variable by echoing it with a null check.
- This method stops execution and returns an error from {{launch_container.sh}} 
when the check fails.

{noformat}
#!/bin/bash

verify_export_variable() {
  name=$1
  var=$2
  echo "Variable ${name} to be defined as ${var:?}"
}

# A few of the export statements which we can see in launch_container.sh.
export NM_PORT="45454"
export SPARK_YARN_CACHE_FILES_TIME_STAMPS="1461794083704"
export USER1="user1${foo.bar}"
export HADOOP_YARN_HOME="/usr/hdp/current/hadoop-yarn-nodemanager"
export CLASSPATH1="$PWD:$PWD/__spark_conf__:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure"


# Start validating each variable. We need to have an entry here for each
# export in launch_container.sh.
echo "Start to validate exported environment variables:"
verify_export_variable NM_PORT "$NM_PORT"
verify_export_variable SPARK_YARN_CACHE_FILES_TIME_STAMPS "$SPARK_YARN_CACHE_FILES_TIME_STAMPS"
verify_export_variable USER1 "$USER1"
verify_export_variable CLASSPATH1 "$CLASSPATH1"
verify_export_variable HADOOP_YARN_HOME "$HADOOP_YARN_HOME"
{noformat}


A possible output is shown below:
{noformat}
BB-f45c89c9cb29:temp sunilg$ ./tmp.sh
./tmp.sh: line 11: user1${foo.bar}: bad substitution
./tmp.sh: line 13: $PWD:$PWD/__spark_conf__:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
Start to validate exported environment variables:
Variable NM_PORT to be defined as 45454
Variable SPARK_YARN_CACHE_FILES_TIME_STAMPS to be defined as 1461794083704
./tmp.sh: line 6: var: parameter null or not set
{noformat}

*USER1* and *CLASSPATH1* have bad substitutions, and we validate *USER1* 
first, so the script stops there.
The *./tmp.sh: line 6: var: parameter null or not set* error is what stops the 
script. Note that even though the *export* statements themselves printed errors 
to the console, the script still continued past them; only the null check 
aborts execution.
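
To make it concrete, in the generated script the validation block would sit between the exports and the actual launch command, so a bad variable stops everything before the container process ever starts. A rough sketch of the intended shape (not the exact generated code; the final command line is just a placeholder):
{noformat}
# ... exports written by the NodeManager ...
export USER1="user1${foo.bar}"

# ... validation block proposed above ...
echo "Start to validate exported environment variables:"
verify_export_variable USER1 "$USER1"   # exits the script here on failure

# The real container command is only reached if every check above passed.
# (Placeholder: the NodeManager writes the actual java command line here.)
exec /bin/bash -c "$CONTAINER_LAUNCH_COMMAND"
{noformat}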

Thoughts?

> When an export var command fails in launch_container.sh, the full container 
> launch should fail
> ----------------------------------------------------------------------------------------------
>
>                 Key: YARN-5219
>                 URL: https://issues.apache.org/jira/browse/YARN-5219
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Hitesh Shah
>            Assignee: Sunil G
>
> Today, a container fails if certain files fail to localize. However, if 
> certain env vars fail to get setup properly either due to bugs in the yarn 
> application or misconfiguration, the actual process launch still gets 
> triggered. This results in either confusing error messages if the process 
> fails to launch or worse yet the process launches but then starts behaving 
> wrongly if the env var is used to control some behavioral aspects. 
> In this scenario, the issue was reproduced by trying to do export 
> abc="$\{foo.bar}" which is invalid as var names cannot contain "." in bash. 


