[ 
https://issues.apache.org/jira/browse/YARN-8207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457139#comment-16457139
 ] 

Eric Yang commented on YARN-8207:
---------------------------------

[~jlowe] Thank you for the review.  Good suggestions on coding style issues.  I 
will fix the coding style issues.

{quote}
stderr.txt is fopen'd and never used before the fclose? Are we supposed to dup2 
these file descriptors to 1 and 2 before the execv so any errors from docker 
run appear in those output files?{quote}

When using launch_script.sh, there is stdout and stderr redirection inside 
launch_script.sh which bind-mount to host log directory.  This is the reason 
that there is fopen and fclosed immediately until YARN-7654 logic are added.

{quote}The parent process that is responsible for obtaining the pid is not 
waiting for the child to complete before running the inspect command. That's 
why retries had to be added to get it to work when they were not needed before. 
The parent should simply wait and check for error exit codes as it did before 
when it was using popen. After that we can ditch the retries since they won't 
be necessary.{quote}

Using launch_script.sh, container-executor runs "docker run" with detach 
option.  It assumes the exit code can be obtained quickly.  This is the reason 
there is no logic for retry "docker inspect".  This assumption is some what 
flawed.  If the docker image is unavailable on the host, docker will show 
download progress and some other information and errors.  The progression are 
not captured, which is difficult to debug.  When docker inspect is probed, 
there is no information of what failed.

Without launch_script.sh, container-executor runs "docker run" in the 
foreground, and obtain pid when the first process is started.  Inspect command 
is checked asynchronously because docker run exit code is only reported when 
the docker process is terminated.  There is a balance between how long that we 
should wait before we decide if the system is hang.  We can make MAX_RETRIES 
configurable in case people like to wait for longer or period of time before 
deciding if the container should fail.

{quote}Why does make_string calculate size = n + 2 instead of n + 1?{quote}

This change make make_string function twice faster than sample code while waste 
1% or less space if recursion is required.  It is probably a reasonable trade 
off for modern day computers.


> Docker container launch use popen have risk of shell expansion
> --------------------------------------------------------------
>
>                 Key: YARN-8207
>                 URL: https://issues.apache.org/jira/browse/YARN-8207
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn-native-services
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>            Priority: Major
>         Attachments: YARN-8207.001.patch
>
>
> Container-executor code utilize a string buffer to construct docker run 
> command, and pass the string buffer to popen for execution.  Popen spawn a 
> shell to run the command.  Some arguments for docker run are still vulnerable 
> to shell expansion.  The possible solution is to convert from char * buffer 
> to string array for execv to avoid shell expansion.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

Reply via email to