[ 
https://issues.apache.org/jira/browse/SPARK-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Or updated SPARK-1850:
-----------------------------

    Description: 
<code>
Found multiple Spark assembly jars in /Users/andrew/Documents/dev/andrew-spark/assembly/target/scala-2.10:
Traceback (most recent call last):
  File "/Users/andrew/Documents/dev/andrew-spark/python/pyspark/shell.py", line 43, in <module>
    sc = SparkContext(os.environ.get("MASTER", "local[*]"), "PySparkShell", pyFiles=add_files)
  File "/Users/andrew/Documents/dev/andrew-spark/python/pyspark/context.py", line 94, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway)
  File "/Users/andrew/Documents/dev/andrew-spark/python/pyspark/context.py", line 180, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway()
  File "/Users/andrew/Documents/dev/andrew-spark/python/pyspark/java_gateway.py", line 49, in launch_gateway
    gateway_port = int(proc.stdout.readline())
ValueError: invalid literal for int() with base 10: 'spark-assembly-1.0.0-SNAPSHOT-hadoop1.0.4-deps.jar\n'
</code>

launch_gateway reads the Java gateway port as an int from the first line of
the subprocess's stdout. Here, what it read was a line of the subprocess's
error message (a jar name), not a port number. We should differentiate between
these cases and propagate the original message if the line is not an int.
Right now, the ValueError above does not point at the real problem.
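
A minimal sketch of the distinction described above, not the actual fix. The
helper name parse_gateway_port, its remaining_output parameter, and the choice
of RuntimeError are all hypothetical and do not exist in PySpark; the sketch
only assumes that the first line of the subprocess's stdout is either a port
number or part of an error message.

```python
def parse_gateway_port(first_line, remaining_output=""):
    """Return the gateway port as an int, or raise with the original output.

    Hypothetical helper for illustration only; not part of PySpark.
    """
    try:
        # int() tolerates surrounding whitespace, including the trailing "\n".
        return int(first_line)
    except ValueError:
        # The line is not a port number, so assume the subprocess printed an
        # error message and surface that text instead of the parsing failure.
        message = (first_line + remaining_output).strip()
        raise RuntimeError(
            "Launching the Java gateway failed with output:\n" + message
        )
```

With something like this, launch_gateway could pass proc.stdout.readline() to
the helper, and the user would see the text the subprocess printed rather than
"invalid literal for int()".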

  was:
Found multiple Spark assembly jars in /Users/andrew/Documents/dev/andrew-spark/assembly/target/scala-2.10:
Traceback (most recent call last):
  File "/Users/andrew/Documents/dev/andrew-spark/python/pyspark/shell.py", line 43, in <module>
    sc = SparkContext(os.environ.get("MASTER", "local[*]"), "PySparkShell", pyFiles=add_files)
  File "/Users/andrew/Documents/dev/andrew-spark/python/pyspark/context.py", line 94, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway)
  File "/Users/andrew/Documents/dev/andrew-spark/python/pyspark/context.py", line 180, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway()
  File "/Users/andrew/Documents/dev/andrew-spark/python/pyspark/java_gateway.py", line 49, in launch_gateway
    gateway_port = int(proc.stdout.readline())
ValueError: invalid literal for int() with base 10: 'spark-assembly-1.0.0-SNAPSHOT-hadoop1.0.4-deps.jar\n'

It's trying to read the Java gateway port as an int from the sub-process' 
STDOUT. However, what it read was an error message, which is clearly not an 
int. We should differentiate between these cases and just propagate the 
original message if it's not an int. Right now, this exception is not very 
helpful.


> Bad exception if multiple jars exist when running PySpark
> ---------------------------------------------------------
>
>                 Key: SPARK-1850
>                 URL: https://issues.apache.org/jira/browse/SPARK-1850
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.0.0
>            Reporter: Andrew Or
>             Fix For: 1.0.1
>
>
> <code>
> Found multiple Spark assembly jars in /Users/andrew/Documents/dev/andrew-spark/assembly/target/scala-2.10:
> Traceback (most recent call last):
>   File "/Users/andrew/Documents/dev/andrew-spark/python/pyspark/shell.py", line 43, in <module>
>     sc = SparkContext(os.environ.get("MASTER", "local[*]"), "PySparkShell", pyFiles=add_files)
>   File "/Users/andrew/Documents/dev/andrew-spark/python/pyspark/context.py", line 94, in __init__
>     SparkContext._ensure_initialized(self, gateway=gateway)
>   File "/Users/andrew/Documents/dev/andrew-spark/python/pyspark/context.py", line 180, in _ensure_initialized
>     SparkContext._gateway = gateway or launch_gateway()
>   File "/Users/andrew/Documents/dev/andrew-spark/python/pyspark/java_gateway.py", line 49, in launch_gateway
>     gateway_port = int(proc.stdout.readline())
> ValueError: invalid literal for int() with base 10: 'spark-assembly-1.0.0-SNAPSHOT-hadoop1.0.4-deps.jar\n'
> </code>
> launch_gateway reads the Java gateway port as an int from the first line of
> the subprocess's stdout. Here, what it read was a line of the subprocess's
> error message (a jar name), not a port number. We should differentiate
> between these cases and propagate the original message if the line is not an
> int. Right now, the ValueError above does not point at the real problem.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
