[
https://issues.apache.org/jira/browse/SAMZA-333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062248#comment-14062248
]
Chris Riccomini commented on SAMZA-333:
---------------------------------------
My first thought for solving this problem is to attach the config as a
resource for the job (like the .tgz). The config would have to sit somewhere
reachable (an HTTP server or HDFS) so that the NodeManager (NM) can download
it and materialize it on local disk. The SamzaContainer and the YARN AM would
then have to be updated to use the config factory to read the config file
that the NM downloaded.
On reflection, though, I'd first like to investigate how Spark, MapReduce,
etc. handle this.
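As a rough illustration of the idea (not the actual Samza or YARN code), the
sketch below passes only a short file path through the environment and lets
the child process read the large config from disk. The variable name
SAMZA_CONFIG_PATH and the temp file are hypothetical stand-ins for whatever
the NM would materialize locally.

```shell
#!/bin/sh
# Sketch under stated assumptions: SAMZA_CONFIG_PATH is a made-up name, and
# the temp file stands in for a config file downloaded by the NM.

# Create a placeholder config of 160000 bytes, which is larger than the
# MAX_ARG_STRLEN value (131072) quoted in the issue.
config_file=$(mktemp)
head -c 160000 /dev/zero | tr '\0' 'A' > "$config_file"

# Only the short path is exported, so the environment stays small.
export SAMZA_CONFIG_PATH="$config_file"

# The child process starts normally and reads the config from the file.
child_bytes=$(sh -c 'wc -c < "$SAMZA_CONFIG_PATH"' | tr -d '[:space:]')
echo "child read ${child_bytes} bytes from ${SAMZA_CONFIG_PATH}"

# Clean up the placeholder file.
rm -f "$config_file"
```

The point of the design is that the size of the config no longer affects the
size of the environment; only the path does.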
> Large samza configurations results in yarn job failure
> -------------------------------------------------------
>
> Key: SAMZA-333
> URL: https://issues.apache.org/jira/browse/SAMZA-333
> Project: Samza
> Issue Type: Bug
> Components: container
> Reporter: Naveen
>
> {code}
> Application application_1404246879802_0019 failed 50 times due to AM
> Container for appattempt_1404246879802_0019_000050 exited with exitCode: 0
> due to: Exception from container-launch: java.io.IOException: Cannot run
> program "nice" (in directory
> "/export/content/data/samsa-yarn/usercache/samza-job/appcache/application_1404246879802_0019/container_1404246879802_0019_50_000001"):
> error=7, Argument list too long
> java.io.IOException: Cannot run program "nice" (in directory
> "/export/content/data/samsa-yarn/usercache/samza/appcache/application_1404246879802_0019/container_1404246879802_0019_50_000001"):
> error=7, Argument list too long
> at java.lang.ProcessBuilder.start(ProcessBuilder.java:1042)
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:448)
> at org.apache.hadoop.util.Shell.run(Shell.java:418)
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
> at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: error=7, Argument list too long
> at java.lang.UNIXProcess.forkAndExec(Native Method)
> at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
> at java.lang.ProcessImpl.start(ProcessImpl.java:134)
> at java.lang.ProcessBuilder.start(ProcessBuilder.java:1023)
> ... 10 more
> .Failing this attempt.. Failing the application.
> {code}
> This happens because the launch_container.sh script generated by YARN
> contains all the exported variables (including the Samza configs) as well as
> the run_container scripts. When a very large config variable is exported,
> launching a process with that environment fails with error=7 (Argument list
> too long).
> For example, the size of the line that sets "SAMZA_SYSTEM_STREAMS" in
> launch_container.sh is:
> {code}
> bash-4.1$ sed '12q;d' launch_container.sh | wc -c
> 167546
> {code}
> As indicated at http://www.in-ulm.de/~mascheck/various/argmax/, the maximum
> size of a single argument or environment string is bounded by
> MAX_ARG_STRLEN (131072 bytes).
> This can be reproduced by exporting a large variable:
> {code}
> [nsomasun@eat1-app201 usercache]$ sudo -uapp bash
> bash-4.1$ export b1=A
> bash-4.1$ export b2=$b1$b1
> bash-4.1$ export b4=$b2$b2
> bash-4.1$ export b8=$b4$b4
> bash-4.1$ export b16=$b8$b8
> bash-4.1$ export b32=$b16$b16
> bash-4.1$ export b64=$b32$b32
> bash-4.1$ export b128=$b64$b64
> bash-4.1$ export b256=$b128$b128
> bash-4.1$ export b512=$b256$b256
> bash-4.1$ export b1k=$b512$b512
> bash-4.1$ export b2k=$b1k$b1k
> bash-4.1$ export b4k=$b2k$b2k
> bash-4.1$ export b8k=$b4k$b4k
> bash-4.1$ export b16k=$b8k$b8k
> bash-4.1$ export b32k=$b16k$b16k
> bash-4.1$ export b64k=$b32k$b32k
> bash-4.1$ export b128k=$b64k$b64k
> bash-4.1$ ls
> bash: /bin/ls: Argument list too long
> {code}
> We need an alternative mechanism to pass configurations to the Samza
> container, since we are bound by the size of the variable the shell can
> support.
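To make the limit described in the issue easier to spot before a launch fails,
here is a minimal bash sketch that lists exported variables whose "NAME=value"
string would not fit in MAX_ARG_STRLEN. The function name oversized_vars is
hypothetical, and the limit is hard-coded to the 131072 value quoted in the
issue, which may differ on systems with a different page size.

```shell
#!/usr/bin/env bash
# Sketch: find exported variables that are too large to pass to exec.
# Assumption: limit is the MAX_ARG_STRLEN value quoted in the issue.
limit=131072

# Print "NAME SIZE" for every exported variable whose "NAME=value" string,
# plus its terminating NUL byte, exceeds the limit. Only shell builtins are
# used, because external commands cannot be started while such a variable
# is exported. Note that ${#entry} counts characters, so multibyte values
# occupy more bytes than reported.
oversized_vars() {
  local name entry
  for name in $(compgen -e); do
    entry="${name}=${!name}"
    if (( ${#entry} + 1 > limit )); then
      printf '%s %d\n' "$name" "${#entry}"
    fi
  done
}
```

Running oversized_vars in the shell that is about to launch the container
prints one line per offending variable, and prints nothing when every
exported variable fits.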