The main issue with running a spark-shell locally is that the shell acts as
the driver, orchestrating the actual computation, so you want it to be
"close" to the Worker nodes for latency reasons. Running a spark-shell on
EC2 in the same region as the Spark cluster avoids this problem.
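As a sketch (assuming a standalone cluster and a 0.8-era launch script; the
master hostname below is a placeholder, not from your setup), launching the
shell from an EC2 node in the same region looks something like:

```shell
# Run from an EC2 instance in the same region as the cluster.
# spark-master-internal-hostname is a placeholder for your master's address;
# 7077 is the standalone master's default port.
MASTER=spark://spark-master-internal-hostname:7077 ./spark-shell
```

The shell then registers with the master as a driver, and the Workers'
executors connect back to it, which is why network proximity matters.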

The error you're seeing seems to indicate a different issue. Check the
Master web UI (accessible on port 8080 at the master's IP address) to make
sure that Workers are successfully registered and they have the expected
amount of memory available to Spark. You can also check to see how much
memory your spark-shell is trying to get per executor. A couple of common
problems are (1) an abandoned spark-shell is holding onto all of your
cluster's resources or (2) you've manually configured your spark-shell to
try to get more memory than your Workers have available. Both of these
should be visible in the web UI.
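For case (2), one hedged sketch of capping per-executor memory below what
each Worker reports in the web UI (SPARK_MEM is the 0.8-era knob; the
hostname is a placeholder, and newer versions use the
spark.executor.memory property instead):

```shell
# Ask for less memory per executor than each Worker shows in the web UI.
# SPARK_MEM was the 0.8-era setting; check your version's configuration docs.
SPARK_MEM=1g MASTER=spark://spark-master-internal-hostname:7077 ./spark-shell
```

If the requested amount exceeds what any Worker can offer, the job sits in
the "Initial job has not accepted any resources" state you're seeing.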


On Mon, Nov 18, 2013 at 5:00 PM, Matt Cheah <mch...@palantir.com> wrote:

>  Hi,
>
>  I'm working with an infrastructure that already has its own web server
> set up on EC2. I would like to set up a *separate* spark cluster on EC2
> with the scripts and have the web server submit jobs to this spark cluster.
>
>  Is it possible to do this? I'm getting some errors running the spark
> shell from the web server: "Initial job has not accepted
> any resources; check your cluster UI to ensure that workers are registered
> and have sufficient memory". I have heard that it's not possible for any
> local computer to connect to the spark cluster, but I was wondering if
> other EC2 nodes could have their firewalls configured to allow this.
>
>  We don't want to deploy the web server on the master node of the spark
> cluster.
>
>  Thanks,
>
>  -Matt Cheah
>
>
>
