Submitting Spark Applications - Do I need to leave ports open?

2015-10-26 Thread markluk
I want to submit interactive applications to a remote Spark cluster running in standalone mode. I understand I need to connect to the master's port 7077. It also seems like the master node needs to open connections back to my local machine, and the ports it needs are different every time.
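In standalone mode the cluster does connect back to the driver, and by default those driver-side ports are chosen randomly on each run. A minimal sketch of one common workaround, assuming a PySpark setup of that era: pin the driver-side ports via configuration so a fixed set of firewall rules suffices. The host name and port numbers below are placeholders.

```python
from pyspark import SparkConf, SparkContext

# Pin the driver-side ports that the cluster connects back to, so the
# firewall needs a fixed set of rules instead of the whole ephemeral range.
conf = (SparkConf()
        .setMaster("spark://master-host:7077")     # master-host is a placeholder
        .setAppName("interactive-app")
        .set("spark.driver.port", "51000")         # RPC port executors dial back to
        .set("spark.blockManager.port", "51001"))  # block transfer service

sc = SparkContext(conf=conf)
```

With those set, opening 51000-51001 inbound on the local machine (plus 7077 outbound to the master) should be enough; left unset, Spark picks fresh ephemeral ports on every submission, which is why the required ports keep changing.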

Spark cluster - use machine name in WorkerID, not IP address

2015-10-01 Thread markluk
I'm running a standalone Spark cluster with 1 master and 2 slaves. My slaves file under conf/ lists the fully qualified domain names of the 2 slave machines. When I look at the Spark web UI (on port 8080), I see my 2 workers, but the worker ID uses the IP address, like
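The worker ID is built from whatever address the worker binds to and advertises. One hedged fix, assuming the `SPARK_LOCAL_HOSTNAME` variable available in Spark releases of that era: set it in `conf/spark-env.sh` on each slave so the daemon advertises its FQDN rather than its IP.

```
# conf/spark-env.sh on each slave -- a sketch; variable name per the
# standalone-mode docs of that era, not verified against every release.
SPARK_LOCAL_HOSTNAME=$(hostname -f)   # advertise the FQDN instead of the IP
```

The workers need a restart for the change to show up in the web UI, since the worker ID is fixed at registration time.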

Worker node timeout exception

2015-09-30 Thread markluk
I set up a new Spark cluster. My worker node is dying with the following exception:

    Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
      at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
      at
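The [120 seconds] matches the default of `spark.network.timeout` in Spark releases of that era. As a first diagnostic step (a sketch only; if the worker genuinely cannot reach the master, raising the timeout just delays the failure), the timeout can be increased for an application when it is submitted:

```python
from pyspark import SparkConf, SparkContext

# Raise the network timeout above its 120s default; "300s" is an
# arbitrary example value, not a recommendation.
conf = (SparkConf()
        .setAppName("timeout-test")
        .set("spark.network.timeout", "300s"))

sc = SparkContext(conf=conf)
```

If it is the worker daemon itself timing out rather than an application, the same property can presumably be passed to the daemons via `SPARK_DAEMON_JAVA_OPTS` (e.g. `-Dspark.network.timeout=300s`). If the timeout still fires, the usual suspects are long GC pauses on the worker or a firewall blocking the connection back to the master.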

Get variable into Spark's foreachRDD function

2015-09-28 Thread markluk
I have a streaming Spark process, and I need to do some logging in the `foreachRDD` function, but I'm having trouble accessing the logger as a variable inside `foreachRDD`. I would like to do the following:

    import logging
    myLogger = logging.getLogger(LOGGER_NAME)
    ...
    ...
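A likely cause, assuming PySpark: `logging.Logger` objects are not picklable, so a module-level logger captured in a closure that has to be shipped to executors fails to serialize. The function passed to `foreachRDD` itself runs on the driver, so a sketch of the usual workaround is to call `logging.getLogger` inside the function rather than capturing the logger object. All names below (`LOGGER_NAME`, the app name, the socket source) are placeholders standing in for the question's setup.

```python
import logging

from pyspark import SparkContext
from pyspark.streaming import StreamingContext

LOGGER_NAME = "my-streaming-app"   # placeholder for the name in the question

def log_batch(rdd):
    # This body executes on the driver; looking the logger up here avoids
    # shipping a (non-picklable) Logger object through the closure.
    logger = logging.getLogger(LOGGER_NAME)
    logger.info("batch contained %d records", rdd.count())

sc = SparkContext(appName="foreachRDD-logging-demo")
ssc = StreamingContext(sc, batchDuration=5)
lines = ssc.socketTextStream("localhost", 9999)   # placeholder input source
lines.foreachRDD(log_batch)
ssc.start()
ssc.awaitTermination()
```

Code that actually runs on the executors (e.g. inside `rdd.foreachPartition`) has to repeat the same lookup locally, since logger configuration and handlers set up on the driver do not exist on the worker processes.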