[ https://issues.apache.org/jira/browse/SPARK-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Taeyun Kim updated SPARK-1825:
------------------------------
    Fix Version/s: 1.0.0

> Windows Spark fails to work with Linux YARN
> -------------------------------------------
>
>                 Key: SPARK-1825
>                 URL: https://issues.apache.org/jira/browse/SPARK-1825
>             Project: Spark
>          Issue Type: Bug
>   Affects Versions: 1.0.0
>            Reporter: Taeyun Kim
>             Fix For: 1.0.0
>
>
> Windows Spark fails to work with Linux YARN.
> This is a cross-platform problem.
> On the YARN side, Hadoop 2.4.0 resolved the issue as follows:
> https://issues.apache.org/jira/browse/YARN-1824
> But the Spark YARN module does not incorporate the new YARN API yet, so the
> problem persists for Spark.
> First, the following source files should be changed:
> - /yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala
> - /yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ExecutorRunnableUtil.scala
> The changes are as follows:
> - Replace .$() with .$$()
> - Replace File.pathSeparator for Environment.CLASSPATH.name with
>   ApplicationConstants.CLASS_PATH_SEPARATOR (import
>   org.apache.hadoop.yarn.api.ApplicationConstants is required for this)
> Unless the above changes are applied, launch_container.sh will contain invalid
> shell script statements (since they will contain Windows-specific separators),
> and the job will fail.
> The following symptoms should also be fixed (I could not find the relevant
> source code):
> - The SPARK_HOME environment variable is copied straight to
>   launch_container.sh. It should be converted to the path format of the
>   server OS or, better, a separate environment variable or configuration
>   variable should be created.
> - The '%HADOOP_MAPRED_HOME%' string still exists in launch_container.sh after
>   the above changes are applied. Maybe I missed a few lines.
> I'm not sure whether this is all, since I'm new to both Spark and YARN.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
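
For illustration, the effect of the two replacements proposed in the issue can be sketched as below. This is a dependency-free model, not Spark or Hadoop code: the three constants copy what I understand to be the literal values of the Hadoop 2.4.0 `ApplicationConstants` fields, and `clientSideRef`, `crossPlatformRef`, and `expandOnNode` are hypothetical helpers that only imitate `Environment.$()`, `Environment.$$()`, and the substitution done on the NodeManager side.

```scala
object CrossPlatformEnvSketch {
  // Assumed literal values of the Hadoop 2.4.0 ApplicationConstants fields,
  // copied here so the sketch runs without Hadoop on the classpath.
  val ClassPathSeparator = "<CPS>" // ApplicationConstants.CLASS_PATH_SEPARATOR
  val ExpansionLeft = "{{"         // ApplicationConstants.PARAMETER_EXPANSION_LEFT
  val ExpansionRight = "}}"        // ApplicationConstants.PARAMETER_EXPANSION_RIGHT

  // Hypothetical model of Environment.$(): the variable reference is written
  // in the shell syntax of the CLIENT machine.
  def clientSideRef(name: String, clientIsWindows: Boolean): String =
    if (clientIsWindows) "%" + name + "%" else "$" + name

  // Hypothetical model of Environment.$$(): a neutral marker that is only
  // expanded later, on the node that launches the container.
  def crossPlatformRef(name: String): String =
    ExpansionLeft + name + ExpansionRight

  // Hypothetical stand-in for the node-side substitution of the markers.
  def expandOnNode(value: String, nodeIsWindows: Boolean): String = {
    val withSeparator =
      value.replace(ClassPathSeparator, if (nodeIsWindows) ";" else ":")
    if (nodeIsWindows)
      withSeparator.replace(ExpansionLeft, "%").replace(ExpansionRight, "%")
    else
      withSeparator.replace(ExpansionLeft, "$").replace(ExpansionRight, "")
  }

  def main(args: Array[String]): Unit = {
    // Before the change: a Windows client bakes its own syntax into the value.
    val before = clientSideRef("PWD", clientIsWindows = true) + ";" +
      clientSideRef("HADOOP_CONF_DIR", clientIsWindows = true)
    // After the change: only neutral markers leave the client.
    val after = crossPlatformRef("PWD") + ClassPathSeparator +
      crossPlatformRef("HADOOP_CONF_DIR")

    println(before)                                     // %PWD%;%HADOOP_CONF_DIR%
    println(after)                                      // {{PWD}}<CPS>{{HADOOP_CONF_DIR}}
    println(expandOnNode(after, nodeIsWindows = false)) // $PWD:$HADOOP_CONF_DIR
  }
}
```

The point of the sketch is that the "before" string is already fixed to Windows syntax when it leaves the client, so a Linux node writes it into launch_container.sh unchanged, while the "after" string stays neutral until the node expands it in its own syntax.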