[ https://issues.apache.org/jira/browse/SPARK-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Or updated SPARK-1900:
-----------------------------
    Component/s: PySpark

> Fix running PySpark files on YARN
> ----------------------------------
>
>                 Key: SPARK-1900
>                 URL: https://issues.apache.org/jira/browse/SPARK-1900
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.0.0
>            Reporter: Andrew Or
>            Priority: Blocker
>             Fix For: 1.0.0
>
>
> This fails currently because of a mismatch in paths.
> On a YARN cluster, spark-submit automatically assumes the file is on HDFS,
> even if it is a relative path that refers to a local file. A natural
> workaround for this is to explicitly specify the "file:" prefix. However,
> this prefix is not understood by python, which fails with the following:
> {code}
> python: can't open file 'file:path/to/my/file.py': [Errno 2] No such file or
> directory
> {code}

--
This message was sent by Atlassian JIRA
(v6.2#6252)
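
To illustrate the path mismatch described in the issue, here is a minimal sketch of how a "file:" URI could be turned back into a plain local path before it is handed to the python interpreter. This is not the fix that went into Spark; the helper name `to_local_python_path` is hypothetical, and the sketch assumes POSIX-style paths (a Windows drive letter such as `C:` would be misread as a URI scheme).

```python
from urllib.parse import urlparse, unquote


def to_local_python_path(path_or_uri):
    """Return a path the python interpreter can open directly.

    Hypothetical helper for illustration only; assumes POSIX-style paths.
    """
    parsed = urlparse(path_or_uri)
    if parsed.scheme == "file":
        # "file:path/to/my/file.py" and "file:///abs/file.py" both keep
        # their path component; only the scheme prefix is dropped.
        return unquote(parsed.path)
    if parsed.scheme == "":
        # Already a plain relative or absolute path.
        return path_or_uri
    # Any other scheme (for example "hdfs") is not a local file.
    raise ValueError("not a local file: %r" % path_or_uri)


if __name__ == "__main__":
    print(to_local_python_path("file:path/to/my/file.py"))
```

With a conversion like this applied first, the interpreter would receive `path/to/my/file.py` rather than the `file:`-prefixed string shown in the error above.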