When you say remote cluster, you need to check a few things:

- No firewall/network rule is blocking the connection (simply ping from the local machine to the remote IP and vice versa; note that the workers also have to connect back to the driver on your machine, not just the other way around).
- All the ports Spark uses are open. Unless you pin them manually, Spark picks random ports for the driver and block manager, so either open a wide range or fix the ports yourself, as sketched below.
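A minimal sketch of pinning those ports in a driver program, so the firewall only needs a fixed set of rules (the host names, port numbers, and app name are placeholders, not values from this thread; with spark-shell itself you would pass the same keys via --conf):

    import org.apache.spark.{SparkConf, SparkContext}

    // Pin the ports Spark would otherwise pick at random, so the firewall
    // between the laptop (driver) and the cluster needs only fixed rules.
    // Hosts/ports below are placeholders.
    val conf = new SparkConf()
      .setMaster("spark://master-host:7077")    // your standalone master URL
      .setAppName("remote-shell-test")
      .set("spark.driver.host", "10.0.0.5")     // address workers can reach the driver on
      .set("spark.driver.port", "51000")        // executor -> driver RPC
      .set("spark.blockManager.port", "51001")  // block transfers
    val sc = new SparkContext(conf)

For spark-shell the equivalent would be --conf spark.driver.port=51000 and so on.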
You can also refer to this discussion:
http://apache-spark-user-list.1001560.n3.nabble.com/Submitting-Spark-job-on-Unix-cluster-from-dev-environment-Windows-td16989.html

Hope it helps.

Thanks
Best Regards

On Sun, Jan 25, 2015 at 2:40 AM, Joseph Lust <jl...@mc10inc.com> wrote:

> I’ve set up a Spark cluster in the last few weeks and everything is
> working, but *I cannot run spark-shell interactively against the cluster
> from a remote host*
>
> - Deploy .jar to cluster from remote (laptop) spark-submit and have it
>   run – Check
> - Run .jar on spark-shell locally – Check
> - Run same .jar on spark-shell on master server – Check
> - Run spark-shell interactively against cluster on master server – Check
> - Run spark-shell interactively from remote (laptop) against cluster – *FAIL*
>
> It seems other people have faced this same issue:
> http://apache-spark-user-list.1001560.n3.nabble.com/spark-shell-working-local-but-not-remote-td19727.html
>
> I’m getting the same warnings about memory, despite plenty of memory
> being available for the job to run (see the working cases above):
>
> "WARN TaskSchedulerImpl: Initial job has not accepted any resources;
> check your cluster UI to ensure that workers are registered and have
> sufficient memory"
>
> Some have suggested it has to do with conflicts of jars on the class
> path, and that Spark is emitting spurious memory warnings while the real
> problem is class path conflicts:
> http://apache-spark-user-list.1001560.n3.nabble.com/WARN-ClusterScheduler-Initial-job-has-not-accepted-any-resources-check-your-cluster-UI-to-ensure-thay-td374.html#a396
>
> Details:
>
> - Cluster: 1 master, 3 workers on 4 GB / 4-core Ubuntu 14.04 LTS
> - Local (aka remote laptop): MacBook Pro, OS X 10.10.1
> - All running HotSpot Java (build 1.8.0_31-b13 and build 1.8.0_25-b17)
> - All running spark-1.2.0-bin-hadoop2.4
> - Using the Standalone cluster manager
>
> Cluster UI: [screenshot not preserved in the archive]
>
> Even when I clamp down to the most restrictive amounts, 1 core, 1
> executor, 128m (of 3G available), it still says I don’t have the resources:
>
> >>> Start console example
> $ spark-shell --executor-memory 128m --total-executor-cores 1 \
>     --driver-cores 1 --master spark://XXXX:7077
>
> 15/01/24 15:57:29 INFO SparkILoop: Created spark context..
> Spark context available as sc.
>
> scala> val rdd = sc.parallelize(1 to 1000);
> rdd: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at parallelize at <console>:12
> scala> rdd.count
>
> 15/01/24 15:58:20 INFO BlockManagerMaster: Updated info of block broadcast_0_piece0
> 15/01/24 15:58:20 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:838
> 15/01/24 15:58:20 INFO DAGScheduler: Submitting 2 missing tasks from Stage 0 (ParallelCollectionRDD[0] at parallelize at <console>:12)
> 15/01/24 15:58:20 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
> 15/01/24 15:58:35 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
> >>> End console example
>
> So, can anyone tell me whether remote interactive spark-shell against a
> Standalone cluster even works? Thanks for your help.
>
> Cluster UI below shows the job running on the cluster, using a driver app
> and a worker, with plenty of cores and GB of memory free.
>
> Sincerely,
> Joe Lust
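P.S. One quick way to tell whether this is really a memory problem: from the spark-shell session above, check whether any executors registered with the driver at all. This assumes the sc shown in the console example is live; getExecutorMemoryStatus is a standard SparkContext method:

    // Prints one entry per registered block manager (the driver plus each
    // executor). If only the driver's own entry shows up, the workers never
    // connected back, which points at the network/ports rather than memory.
    sc.getExecutorMemoryStatus.foreach { case (host, (maxMem, remaining)) =>
      println(s"$host: max=${maxMem / (1024 * 1024)}MB, free=${remaining / (1024 * 1024)}MB")
    }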