Andrew Or created SPARK-1850:
--------------------------------
Summary: Bad exception if multiple jars exist when running PySpark
Key: SPARK-1850
URL: https://issues.apache.org/jira/browse/SPARK-1850
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 1.0.0
Reporter: Andrew Or
Fix For: 1.0.1
Found multiple Spark assembly jars in /Users/andrew/Documents/dev/andrew-spark/assembly/target/scala-2.10:
Traceback (most recent call last):
  File "/Users/andrew/Documents/dev/andrew-spark/python/pyspark/shell.py", line 43, in <module>
    sc = SparkContext(os.environ.get("MASTER", "local[*]"), "PySparkShell", pyFiles=add_files)
  File "/Users/andrew/Documents/dev/andrew-spark/python/pyspark/context.py", line 94, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway)
  File "/Users/andrew/Documents/dev/andrew-spark/python/pyspark/context.py", line 180, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway()
  File "/Users/andrew/Documents/dev/andrew-spark/python/pyspark/java_gateway.py", line 49, in launch_gateway
    gateway_port = int(proc.stdout.readline())
ValueError: invalid literal for int() with base 10: 'spark-assembly-1.0.0-SNAPSHOT-hadoop1.0.4-deps.jar\n'
launch_gateway() reads the Java gateway port as an int from the sub-process's stdout. Here, however, the sub-process printed an error message ("Found multiple Spark assembly jars ...") instead of a port number, so the user gets a bare ValueError rather than the real cause. We should differentiate between these cases: if the output does not parse as an int, propagate the sub-process's original message instead. Right now, this exception is not very helpful.
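
A minimal sketch of the kind of check this could use, based on the call in java_gateway.py shown in the traceback. The helper name read_gateway_port and the exact error-handling shape are assumptions for illustration, not the actual fix; it also assumes proc was launched with a text-mode stdout pipe.

{code:python}
def read_gateway_port(proc):
    """Read the gateway port printed by the launcher sub-process.

    If the sub-process printed an error message instead of a port
    number, raise with that message rather than a bare ValueError.
    """
    first_line = proc.stdout.readline()
    try:
        return int(first_line)
    except ValueError:
        # Not a port number: the launcher printed an error
        # (e.g. "Found multiple Spark assembly jars in ...").
        # Surface the sub-process's full output to the user.
        rest = proc.stdout.read()
        raise Exception("Failed to launch the Java gateway:\n"
                        + first_line + rest)
{code}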
--
This message was sent by Atlassian JIRA
(v6.2#6252)