I have some Python code that consistently ends up in this state:

    ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server
    Traceback (most recent call last):
      File "/home/ubuntu/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 690, in start
        self.socket.connect((self.address, self.port))
      File "/usr/lib/python2.7/socket.py", line 224, in meth
        return getattr(self._sock,name)(*args)
    error: [Errno 111] Connection refused

(the same py4j error is logged twice), followed by:

    Traceback (most recent call last):
      File "<stdin>", line 2, in <module>
      File "/home/ubuntu/spark/python/pyspark/sql/dataframe.py", line 280, in collect
        port = self._jdf.collectToPython()
      File "/home/ubuntu/spark/python/pyspark/traceback_utils.py", line 78, in __exit__
        self._context._jsc.setCallSite(None)
      File "/home/ubuntu/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 811, in __call__
      File "/home/ubuntu/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 624, in send_command
      File "/home/ubuntu/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 579, in _get_connection
      File "/home/ubuntu/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 585, in _create_connection
      File "/home/ubuntu/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 697, in start
    py4j.protocol.Py4JNetworkError: An error occurred while trying to connect to the Java server
This happens even though I start pyspark with these options:

    ./pyspark --master local[4] --executor-memory 14g --driver-memory 14g \
      --packages com.databricks:spark-csv_2.11:1.4.0 \
      --spark.deploy.recoveryMode=FILESYSTEM

and have this in my conf/spark-env.sh file:

    SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=FILESYSTEM -Dspark.deploy.recoveryDirectory=/user/recovery"

How can I get HA (high availability) to work in Spark?

Thanks,
Imran
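For reference, `--spark.deploy.recoveryMode=FILESYSTEM` is not a flag the pyspark/spark-submit launcher recognizes; arbitrary Spark properties are normally passed with `--conf key=value`. A sketch of the equivalent invocation in that form (the recovery directory below just mirrors the one already set in spark-env.sh):

```shell
# Same settings expressed via --conf, the standard way to pass
# arbitrary Spark properties to pyspark/spark-submit.
./pyspark --master local[4] \
  --executor-memory 14g --driver-memory 14g \
  --packages com.databricks:spark-csv_2.11:1.4.0 \
  --conf spark.deploy.recoveryMode=FILESYSTEM \
  --conf spark.deploy.recoveryDirectory=/user/recovery
```

Note that, as I understand it, `spark.deploy.recoveryMode` is read by the standalone Master daemon, so it may have no effect when running under `--master local[4]`.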