Re: Configuration Problem? (need help to get Spark job executed)

2015-02-17 Thread Arush Kharbanda
Hi

It could be due to a connectivity issue between the master and the slaves.

I have seen this error occur for the following reasons. Are the slaves
visible in the Spark UI, and how much memory is allocated to the executors?

1. Configuration not being in sync between the Spark master and the slaves.
2. Network connectivity issues between the master and the slaves.
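A third thing worth checking (an editorial sketch with hypothetical values, not from the thread): in standalone mode each executor gets only the 512 MB default unless spark.executor.memory is set, so a job can be starved of memory even though SPARK_WORKER_MEMORY is generous — the log below shows executors granted exactly "512.0 MB RAM". Something like this in conf/spark-defaults.conf would raise it:

```
# Hypothetical values; spark.executor.memory must not exceed
# SPARK_WORKER_MEMORY on the workers (6g in this setup)
spark.executor.memory   4g
# Optionally cap how many cores the app claims across the cluster
spark.cores.max         6
```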

Thanks
Arush

On Sat, Feb 14, 2015 at 3:07 PM, NORD SC jan.algermis...@nordsc.com wrote:

 [quoted message and startup log trimmed; the full text appears in the
 original post below]

Configuration Problem? (need help to get Spark job executed)

2015-02-14 Thread NORD SC
Hi all,

I am new to Spark and seem to have hit a common newbie obstacle.

I have a pretty simple setup and job, but I am unable to get past this error 
when executing a job:

"TaskSchedulerImpl: Initial job has not accepted any resources; check your 
cluster UI to ensure that workers are registered and have sufficient memory"

I have so far gained a basic understanding of worker/executor/driver memory, 
but have run out of ideas what to try next - maybe someone has a clue.


My setup:

Three-node standalone cluster with C* and Spark on each node, and the DataStax 
C*/Spark connector JAR placed on each node.

On the master I have the slaves configured in conf/slaves and I am using 
sbin/start-all.sh to start the whole cluster.

On each node I have this in conf/spark-defaults.conf:

spark.master             spark://devpeng-db-cassandra-1:7077
spark.eventLog.enabled   true
spark.serializer         org.apache.spark.serializer.KryoSerializer

spark.executor.extraClassPath  /opt/spark-cassandra-connector-assembly-1.2.0-alpha1.jar

and this in conf/spark-env.sh:

SPARK_WORKER_MEMORY=6g



My app looks like this:

import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._  // adds cassandraTable to SparkContext

object TestApp extends App {
  val conf = new SparkConf(true).set("spark.cassandra.connection.host", 
    "devpeng-db-cassandra-1.")
  val sc = new SparkContext("spark://devpeng-db-cassandra-1:7077", "testApp", 
    conf)
  val rdd = sc.cassandraTable("test", "kv")
  println("Count: " + String.valueOf(rdd.count))
  println(rdd.first)
}
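One caveat worth noting here (an editorial aside, not from the thread): the Spark documentation advises defining a main() method instead of extending scala.App, because the delayed initialization in App subclasses may not work correctly with Spark. A sketch of the same program in that form, with an explicit sc.stop() added so the executors are released when the job finishes (names and hosts taken from the post above):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._  // adds cassandraTable to SparkContext

object TestApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf(true)
      .set("spark.cassandra.connection.host", "devpeng-db-cassandra-1.")
    val sc = new SparkContext("spark://devpeng-db-cassandra-1:7077", "testApp", conf)
    val rdd = sc.cassandraTable("test", "kv")  // keyspace "test", table "kv"
    println("Count: " + rdd.count)
    println(rdd.first)
    sc.stop()  // release executors so the cluster frees the resources
  }
}
```

This needs a live cluster to run, so it is only a sketch; the structural change (main() rather than App) is the point.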

Any idea what to check next would help me at this point, I think.

Jan

Log of the application start:

[info] Loading project definition from 
/Users/jan/projects/gkh/jump/workspace/gkh-spark-example/project
[info] Set current project to csconnect (in build 
file:/Users/jan/projects/gkh/jump/workspace/gkh-spark-example/)
[info] Compiling 1 Scala source to 
/Users/jan/projects/gkh/jump/workspace/gkh-spark-example/target/scala-2.10/classes...
[info] Running jump.TestApp 
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/02/14 10:30:11 INFO SecurityManager: Changing view acls to: jan
15/02/14 10:30:11 INFO SecurityManager: Changing modify acls to: jan
15/02/14 10:30:11 INFO SecurityManager: SecurityManager: authentication 
disabled; ui acls disabled; users with view permissions: Set(jan); users with 
modify permissions: Set(jan)
15/02/14 10:30:11 INFO Slf4jLogger: Slf4jLogger started
15/02/14 10:30:11 INFO Remoting: Starting remoting
15/02/14 10:30:12 INFO Remoting: Remoting started; listening on addresses 
:[akka.tcp://sparkDriver@xx:58197]
15/02/14 10:30:12 INFO Utils: Successfully started service 'sparkDriver' on 
port 58197.
15/02/14 10:30:12 INFO SparkEnv: Registering MapOutputTracker
15/02/14 10:30:12 INFO SparkEnv: Registering BlockManagerMaster
15/02/14 10:30:12 INFO DiskBlockManager: Created local directory at 
/var/folders/vr/w3whx92d0356g5nj1p6s59grgn/T/spark-local-20150214103012-5b53
15/02/14 10:30:12 INFO MemoryStore: MemoryStore started with capacity 530.3 MB
2015-02-14 10:30:12.304 java[24999:3b07] Unable to load realm info from 
SCDynamicStore
15/02/14 10:30:12 WARN NativeCodeLoader: Unable to load native-hadoop library 
for your platform... using builtin-java classes where applicable
15/02/14 10:30:12 INFO HttpFileServer: HTTP File server directory is 
/var/folders/vr/w3whx92d0356g5nj1p6s59grgn/T/spark-48459a22-c1ff-42d5-8b8e-cc89fe84933d
15/02/14 10:30:12 INFO HttpServer: Starting HTTP Server
15/02/14 10:30:12 INFO Utils: Successfully started service 'HTTP file server' 
on port 58198.
15/02/14 10:30:12 INFO Utils: Successfully started service 'SparkUI' on port 
4040.
15/02/14 10:30:12 INFO SparkUI: Started SparkUI at http://xx:4040
15/02/14 10:30:12 INFO AppClient$ClientActor: Connecting to master 
spark://devpeng-db-cassandra-1:7077...
15/02/14 10:30:13 INFO SparkDeploySchedulerBackend: Connected to Spark cluster 
with app ID app-20150214103013-0001
15/02/14 10:30:13 INFO AppClient$ClientActor: Executor added: 
app-20150214103013-0001/0 on 
worker-20150214102534-devpeng-db-cassandra-2.devpeng 
(devpeng-db-cassandra-2.devpeng.x:57563) with 8 cores
15/02/14 10:30:13 INFO SparkDeploySchedulerBackend: Granted executor ID 
app-20150214103013-0001/0 on hostPort devpeng-db-cassandra-2.devpeng.:57563 
with 8 cores, 512.0 MB RAM
15/02/14 10:30:13 INFO AppClient$ClientActor: Executor added: 
app-20150214103013-0001/1 on 
worker-20150214102534-devpeng-db-cassandra-3.devpeng.-38773 
(devpeng-db-cassandra-3.devpeng.xx:38773) with 8 cores
15/02/14 10:30:13 INFO SparkDeploySchedulerBackend: Granted executor ID 
app-20150214103013-0001/1 on hostPort 
devpeng-db-cassandra-3.devpeng.xe:38773 with 8 cores, 512.0 MB RAM
15/02/14 10:30:13 INFO AppClient$ClientActor: Executor updated: 
app-20150214103013-0001/0 is now LOADING
15/02/14 10:30:13 INFO AppClient$ClientActor: Executor updated: 
app-20150214103013-0001/1 is now LOADING
15/02/14 10:30:13 INFO AppClient$ClientActor: Executor updated: