Hi

I have a Spark/Cassandra setup where I am using the Spark Cassandra Java connector to query a table. So far I have one Spark master node (2 cores) and one worker node (4 cores). Both have the following spark-env.sh under conf/:

    #!/usr/bin/env bash
    export SPARK_LOCAL_IP=127.0.0.1
    export SPARK_MASTER_IP="192.168.4.134"
    export SPARK_WORKER_MEMORY=1G
    export SPARK_EXECUTOR_MEMORY=2G

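For reference, here is a sketch of how I believe these settings are meant to fit together (the executor size and the use of each node's own routable address instead of loopback are my assumptions, not confirmed behavior):

```shell
#!/usr/bin/env bash
# Bind each node to its own routable address rather than loopback
# (assumption: SPARK_LOCAL_IP=127.0.0.1 on both machines may prevent
# the driver and worker from reaching each other).
export SPARK_LOCAL_IP="$(hostname -I | awk '{print $1}')"
export SPARK_MASTER_IP="192.168.4.134"
# Keep per-executor memory within the worker's total memory
# (sizes below are illustrative; note 2G > 1G in my current file).
export SPARK_WORKER_MEMORY=1G
export SPARK_EXECUTOR_MEMORY=512M
```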
I am using Spark 1.4.1 along with Cassandra 2.2.0. I have started the Cassandra/Spark setup, created a keyspace and table in Cassandra, and added some rows to the table. Now I try to run the following Spark job using the Spark Cassandra Java connector:

    SparkConf conf = new SparkConf();
    conf.setAppName("Testing");
    conf.setMaster("spark://192.168.4.134:7077");
    conf.set("spark.cassandra.connection.host", "192.168.4.129");
    conf.set("spark.logConf", "true");
    conf.set("spark.driver.maxResultSize", "50m");
    conf.set("spark.executor.memory", "200m");
    conf.set("spark.eventLog.enabled", "true");
    conf.set("spark.eventLog.dir", "/tmp/");
    conf.set("spark.executor.extraClassPath", "/home/enlighted/ebd.jar");
    conf.set("spark.cores.max", "1");
    JavaSparkContext sc = new JavaSparkContext(conf);

    JavaRDD<String> cassandraRowsRDD =
        CassandraJavaUtil.javaFunctions(sc).cassandraTable("testing", "ec")
            .map(new Function<CassandraRow, String>() {
                private static final long serialVersionUID = -6263533266898869895L;
                @Override
                public String call(CassandraRow cassandraRow) throws Exception {
                    return cassandraRow.toString();
                }
            });
    System.out.println("Data as CassandraRows: \n"
        + StringUtils.join(cassandraRowsRDD.toArray(), "\n"));
    sc.close();



This job gets stuck with an insufficient-resources warning. Here are the logs:

   1107 [main] INFO org.apache.spark.SparkContext  - Spark configuration:
   spark.app.name=Testing
   spark.cassandra.connection.host=192.168.4.129
   spark.cores.max=1
   spark.driver.maxResultSize=50m
   spark.eventLog.dir=/tmp/
   spark.eventLog.enabled=true
   spark.executor.extraClassPath=/home/enlighted/ebd.jar
   spark.executor.memory=200m
   spark.logConf=true
   spark.master=spark://192.168.4.134:7077
   1121 [main] INFO org.apache.spark.SecurityManager  - Changing view
   acls to: enlighted
   1122 [main] INFO org.apache.spark.SecurityManager  - Changing modify
   acls to: enlighted
   1123 [main] INFO org.apache.spark.SecurityManager  -
   SecurityManager: authentication disabled; ui acls disabled; users
   with view permissions: Set(enlighted); users with modify
   permissions: Set(enlighted)
   1767 [sparkDriver-akka.actor.default-dispatcher-4] INFO
   akka.event.slf4j.Slf4jLogger  - Slf4jLogger started
   1805 [sparkDriver-akka.actor.default-dispatcher-4] INFO Remoting -
   Starting remoting
   1957 [main] INFO org.apache.spark.util.Utils  - Successfully started
   service 'sparkDriver' on port 54611.
   1958 [sparkDriver-akka.actor.default-dispatcher-4] INFO Remoting -
   Remoting started; listening on addresses
   :[akka.tcp://sparkDriver@192.168.4.134:54611]
   1977 [main] INFO org.apache.spark.SparkEnv  - Registering
   MapOutputTracker
   1989 [main] INFO org.apache.spark.SparkEnv  - Registering
   BlockManagerMaster
   2007 [main] INFO org.apache.spark.storage.DiskBlockManager  -
   Created local directory at
   
/tmp/spark-f21125fd-ae9d-460e-884d-563fa8720f09/blockmgr-3e3d54e7-16df-4e97-be48-b0c0fa0389e7
   2012 [main] INFO org.apache.spark.storage.MemoryStore  - MemoryStore
   started with capacity 456.0 MB
   2044 [main] INFO org.apache.spark.HttpFileServer  - HTTP File server
   directory is
   
/tmp/spark-f21125fd-ae9d-460e-884d-563fa8720f09/httpd-64b4d92e-cde9-45fb-8b38-edc3cca3933c
   2046 [main] INFO org.apache.spark.HttpServer  - Starting HTTP Server
   2086 [main] INFO org.spark-project.jetty.server.Server  -
   jetty-8.y.z-SNAPSHOT
   2098 [main] INFO org.spark-project.jetty.server.AbstractConnector -
   Started SocketConnector@0.0.0.0:44884
   2099 [main] INFO org.apache.spark.util.Utils  - Successfully started
   service 'HTTP file server' on port 44884.
   2108 [main] INFO org.apache.spark.SparkEnv  - Registering
   OutputCommitCoordinator
   2297 [main] INFO org.spark-project.jetty.server.Server  -
   jetty-8.y.z-SNAPSHOT
   2317 [main] INFO org.spark-project.jetty.server.AbstractConnector -
   Started SelectChannelConnector@0.0.0.0:4040
   2318 [main] INFO org.apache.spark.util.Utils  - Successfully started
   service 'SparkUI' on port 4040.
   2320 [main] INFO org.apache.spark.ui.SparkUI  - Started SparkUI at
   http://192.168.4.134:4040
   2387 [sparkDriver-akka.actor.default-dispatcher-3] INFO
   org.apache.spark.deploy.client.AppClient$ClientActor  - Connecting
   to master akka.tcp://sparkMaster@192.168.4.134:7077/user/Master...
   2662 [sparkDriver-akka.actor.default-dispatcher-14] INFO
   org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend  -
   Connected to Spark cluster with app ID app-20150806054450-0001
   2680 [sparkDriver-akka.actor.default-dispatcher-14] INFO
   org.apache.spark.deploy.client.AppClient$ClientActor  - Executor
   added: app-20150806054450-0001/0 on
   worker-20150806053100-192.168.4.129-45566 (192.168.4.129:45566) with
   1 cores
   2682 [sparkDriver-akka.actor.default-dispatcher-14] INFO
   org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend  -
   Granted executor ID app-20150806054450-0001/0 on hostPort
   192.168.4.129:45566 with 1 cores, 200.0 MB RAM
   2696 [main] INFO org.apache.spark.util.Utils  - Successfully started
   service 'org.apache.spark.network.netty.NettyBlockTransferService'
   on port 49150.
   2696 [main] INFO
   org.apache.spark.network.netty.NettyBlockTransferService  - Server
   created on 49150
   2700 [sparkDriver-akka.actor.default-dispatcher-14] INFO
   org.apache.spark.deploy.client.AppClient$ClientActor  - Executor
   updated: app-20150806054450-0001/0 is now LOADING
   2706 [main] INFO org.apache.spark.storage.BlockManagerMaster  -
   Trying to register BlockManager
   2708 [sparkDriver-akka.actor.default-dispatcher-17] INFO
   org.apache.spark.deploy.client.AppClient$ClientActor  - Executor
   updated: app-20150806054450-0001/0 is now RUNNING
   2710 [sparkDriver-akka.actor.default-dispatcher-14] INFO
   org.apache.spark.storage.BlockManagerMasterEndpoint  - Registering
   block manager 192.168.4.134:49150 with 456.0 MB RAM,
   BlockManagerId(driver, 192.168.4.134, 49150)
   2713 [main] INFO org.apache.spark.storage.BlockManagerMaster  -
   Registered BlockManager
   2922 [main] INFO org.apache.spark.scheduler.EventLoggingListener -
   Logging events to file:/tmp/app-20150806054450-0001
   2939 [main] INFO
   org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend  -
   SchedulerBackend is ready for scheduling beginning after reached
   minRegisteredResourcesRatio: 0.0
   3321 [main] INFO com.datastax.driver.core.Cluster  - New Cassandra
   host /192.168.4.129:9042 added
   3321 [main] INFO com.datastax.driver.core.Cluster  - New Cassandra
   host /192.168.4.130:9042 added
   3322 [main] INFO
   com.datastax.spark.connector.cql.LocalNodeFirstLoadBalancingPolicy -
   Added host 192.168.4.130 (DC1)
   3322 [main] INFO com.datastax.driver.core.Cluster  - New Cassandra
   host /192.168.4.131:9042 added
   3323 [main] INFO
   com.datastax.spark.connector.cql.LocalNodeFirstLoadBalancingPolicy -
   Added host 192.168.4.131 (DC1)
   3323 [main] INFO com.datastax.driver.core.Cluster  - New Cassandra
   host /192.168.4.132:9042 added
   3323 [main] INFO
   com.datastax.spark.connector.cql.LocalNodeFirstLoadBalancingPolicy -
   Added host 192.168.4.132 (DC1)
   3325 [main] INFO
   com.datastax.spark.connector.cql.CassandraConnector  - Connected to
   Cassandra cluster: enldbcluster
   3881 [main] INFO org.apache.spark.SparkContext  - Starting job:
   toArray at Start.java:85
   3898 [pool-18-thread-1] INFO
   com.datastax.spark.connector.cql.CassandraConnector  - Disconnected
   from Cassandra cluster: enldbcluster
   3901 [dag-scheduler-event-loop] INFO
   org.apache.spark.scheduler.DAGScheduler  - Got job 0 (toArray at
   Start.java:85) with 6 output partitions (allowLocal=false)
   3902 [dag-scheduler-event-loop] INFO
   org.apache.spark.scheduler.DAGScheduler  - Final stage: ResultStage
   0(toArray at Start.java:85)
   3902 [dag-scheduler-event-loop] INFO
   org.apache.spark.scheduler.DAGScheduler  - Parents of final stage:
   List()
   3908 [dag-scheduler-event-loop] INFO
   org.apache.spark.scheduler.DAGScheduler  - Missing parents: List()
   3925 [dag-scheduler-event-loop] INFO
   org.apache.spark.scheduler.DAGScheduler  - Submitting ResultStage 0
   (MapPartitionsRDD[1] at map at Start.java:77), which has no missing
   parents
   4002 [dag-scheduler-event-loop] INFO
   org.apache.spark.storage.MemoryStore  - ensureFreeSpace(7488) called
   with curMem=0, maxMem=478182113
   4004 [dag-scheduler-event-loop] INFO
   org.apache.spark.storage.MemoryStore  - Block broadcast_0 stored as
   values in memory (estimated size 7.3 KB, free 456.0 MB)
   4013 [dag-scheduler-event-loop] INFO
   org.apache.spark.storage.MemoryStore  - ensureFreeSpace(4015) called
   with curMem=7488, maxMem=478182113
   4013 [dag-scheduler-event-loop] INFO
   org.apache.spark.storage.MemoryStore  - Block broadcast_0_piece0
   stored as bytes in memory (estimated size 3.9 KB, free 456.0 MB)
   4015 [sparkDriver-akka.actor.default-dispatcher-14] INFO
   org.apache.spark.storage.BlockManagerInfo  - Added
   broadcast_0_piece0 in memory on 192.168.4.134:49150 (size: 3.9 KB,
   free: 456.0 MB)
4017 [dag-scheduler-event-loop] INFO org.apache.spark.SparkContext - Created broadcast 0 from broadcast at DAGScheduler.scala:874
   4089 [dag-scheduler-event-loop] INFO
   com.datastax.driver.core.Cluster  - New Cassandra host
   /192.168.4.129:9042 added
   4089 [dag-scheduler-event-loop] INFO
   com.datastax.driver.core.Cluster  - New Cassandra host
   /192.168.4.130:9042 added
   4089 [dag-scheduler-event-loop] INFO
   com.datastax.driver.core.Cluster  - New Cassandra host
   /192.168.4.131:9042 added
   4089 [dag-scheduler-event-loop] INFO
   com.datastax.driver.core.Cluster  - New Cassandra host
   /192.168.4.132:9042 added
   4089 [dag-scheduler-event-loop] INFO
   com.datastax.spark.connector.cql.CassandraConnector  - Connected to
   Cassandra cluster: enldbcluster
   4394 [pool-18-thread-1] INFO
   com.datastax.spark.connector.cql.CassandraConnector  - Disconnected
   from Cassandra cluster: enldbcluster
   4806 [dag-scheduler-event-loop] INFO
   org.apache.spark.scheduler.DAGScheduler  - Submitting 6 missing
   tasks from ResultStage 0 (MapPartitionsRDD[1] at map at Start.java:77)
   4807 [dag-scheduler-event-loop] INFO
   org.apache.spark.scheduler.TaskSchedulerImpl  - Adding task set 0.0
   with 6 tasks
   19822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl -
   Initial job has not accepted any resources; check your cluster UI to
   ensure that workers are registered and have sufficient resources
   34822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl -
   Initial job has not accepted any resources; check your cluster UI to
   ensure that workers are registered and have sufficient resources
   49822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl -
   Initial job has not accepted any resources; check your cluster UI to
   ensure that workers are registered and have sufficient resources
   64822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl -
   Initial job has not accepted any resources; check your cluster UI to
   ensure that workers are registered and have sufficient resources
   79822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl -
   Initial job has not accepted any resources; check your cluster UI to
   ensure that workers are registered and have sufficient resources
   94822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl -
   Initial job has not accepted any resources; check your cluster UI to
   ensure that workers are registered and have sufficient resources
109822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl - Initial job has not accepted any resources; check your cluster UI
   to ensure that workers are registered and have sufficient resources
124822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl - Initial job has not accepted any resources; check your cluster UI
   to ensure that workers are registered and have sufficient resources
   124963 [sparkDriver-akka.actor.default-dispatcher-14] INFO
   org.apache.spark.deploy.client.AppClient$ClientActor  - Executor
   updated: app-20150806054450-0001/0 is now EXITED (Command exited
   with code 1)
   124964 [sparkDriver-akka.actor.default-dispatcher-14] INFO
   org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend  -
   Executor app-20150806054450-0001/0 removed: Command exited with code 1
   124968 [sparkDriver-akka.actor.default-dispatcher-17] ERROR
   org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend  -
   Asked to remove non-existent executor 0
   124969 [sparkDriver-akka.actor.default-dispatcher-14] INFO
   org.apache.spark.deploy.client.AppClient$ClientActor  - Executor
   added: app-20150806054450-0001/1 on
   worker-20150806053100-192.168.4.129-45566 (192.168.4.129:45566) with
   1 cores
   124969 [sparkDriver-akka.actor.default-dispatcher-14] INFO
   org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend  -
   Granted executor ID app-20150806054450-0001/1 on hostPort
   192.168.4.129:45566 with 1 cores, 200.0 MB RAM
   124975 [sparkDriver-akka.actor.default-dispatcher-14] INFO
   org.apache.spark.deploy.client.AppClient$ClientActor  - Executor
   updated: app-20150806054450-0001/1 is now RUNNING
   125012 [sparkDriver-akka.actor.default-dispatcher-14] INFO
   org.apache.spark.deploy.client.AppClient$ClientActor  - Executor
   updated: app-20150806054450-0001/1 is now LOADING
139822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl - Initial job has not accepted any resources; check your cluster UI
   to ensure that workers are registered and have sufficient resources
154822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl - Initial job has not accepted any resources; check your cluster UI
   to ensure that workers are registered and have sufficient resources
169823 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl - Initial job has not accepted any resources; check your cluster UI
   to ensure that workers are registered and have sufficient resources
184822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl - Initial job has not accepted any resources; check your cluster UI
   to ensure that workers are registered and have sufficient resources
199822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl - Initial job has not accepted any resources; check your cluster UI
   to ensure that workers are registered and have sufficient resources

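The logs show the executor being granted only 1 core and 200.0 MB RAM, then exiting with code 1. For what it's worth, here is a sketch of the resource-related conf values that would keep the executor request within SPARK_WORKER_MEMORY=1G (the exact size is an assumption on my part, not a confirmed fix):

```java
SparkConf conf = new SparkConf()
        .setAppName("Testing")
        .setMaster("spark://192.168.4.134:7077")
        .set("spark.cassandra.connection.host", "192.168.4.129")
        // Request an executor that fits inside the worker's 1G of memory;
        // 200m may be too small for the executor JVM to even start
        // (512m is an illustrative value, not a verified requirement).
        .set("spark.executor.memory", "512m")
        .set("spark.cores.max", "1");
```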

Please find attached the Spark master UI screenshot; my pom.xml with dependencies is below.

Can anyone please point out what the issue could be here?



<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
         http://maven.apache.org/maven-v4_0_0.xsd">
	<modelVersion>4.0.0</modelVersion>
	<groupId>enlighted</groupId>
	<artifactId>ebd</artifactId>
	<packaging>jar</packaging>
	<version>1.0</version>
	<name>ebd</name>

	<properties>
		<slf4j.version>1.7.12</slf4j.version>
		<maven.jar.plugin.version>2.6</maven.jar.plugin.version>
		<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
		<project.implementation.title>ebd</project.implementation.title>
		<project.implementation.vendor>Enlighted Inc</project.implementation.vendor>
		<org.springframework.version>4.2.0.RELEASE</org.springframework.version>
	</properties>

	<dependencies>
		<dependency>
			<groupId>org.slf4j</groupId>
			<artifactId>slf4j-log4j12</artifactId>
			<version>${slf4j.version}</version>
		</dependency>
		<dependency>
			<groupId>org.springframework</groupId>
			<artifactId>spring-core</artifactId>
			<version>${org.springframework.version}</version>
		</dependency>
		<dependency>
			<groupId>commons-io</groupId>
			<artifactId>commons-io</artifactId>
			<version>2.4</version>
		</dependency>
		<dependency>
			<groupId>com.datastax.cassandra</groupId>
			<artifactId>cassandra-driver-core</artifactId>
			<version>2.1.7.1</version>
		</dependency>
		<dependency>
			<groupId>org.apache.spark</groupId>
			<artifactId>spark-core_2.10</artifactId>
			<version>1.4.1</version>
		</dependency>
		<dependency>
			<groupId>org.apache.spark</groupId>
			<artifactId>spark-streaming_2.10</artifactId>
			<version>1.4.1</version>
		</dependency>
		<dependency>
			<groupId>com.datastax.spark</groupId>
			<artifactId>spark-cassandra-connector_2.10</artifactId>
			<version>1.4.0-M2</version>
		</dependency>
		<dependency>
			<groupId>com.datastax.spark</groupId>
			<artifactId>spark-cassandra-connector-java_2.10</artifactId>
			<version>1.4.0-M2</version>
		</dependency>

		
	</dependencies>
	<build>
		<finalName>ebd</finalName>
		<plugins>
			<plugin>
				<groupId>org.apache.maven.plugins</groupId>
				<artifactId>maven-jar-plugin</artifactId>
				<version>${maven.jar.plugin.version}</version>
				<configuration>
					<archive>
						<manifestSections>
							<manifestSection>
								<name>${project.name}</name>
								<manifestEntries>
									<Implementation-Title>${project.implementation.title}</Implementation-Title>
									<Implementation-Vendor>${project.implementation.vendor}</Implementation-Vendor>
									<Implementation-Version>${project.version}</Implementation-Version>
								</manifestEntries>
							</manifestSection>
						</manifestSections>
					</archive>
				</configuration>
			</plugin>
			<plugin>
				<groupId>org.apache.maven.plugins</groupId>
				<artifactId>maven-jar-plugin</artifactId>
				<version>${maven.jar.plugin.version}</version>
				<configuration>
					<archive>
						<manifest>
							<mainClass>com.ebd.Start</mainClass>
						</manifest>
					</archive>
				</configuration>
			</plugin>
			<plugin>
				<groupId>org.apache.maven.plugins</groupId>
				<artifactId>maven-shade-plugin</artifactId>
				<version>2.4.1</version>
				<executions>
					<execution>
						<phase>package</phase>
						<goals>
							<goal>shade</goal>
						</goals>
						<configuration>
							<transformers>
								<transformer
									implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
									<resource>META-INF/spring.handlers</resource>
								</transformer>
								<transformer
									implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
									<resource>META-INF/spring.schemas</resource>
								</transformer>
								<transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                                    <resource>reference.conf</resource>
                                </transformer>
							</transformers>
							<filters>
					            <filter>
					              <artifact>*:*</artifact>
					              <excludes>
					                <exclude>META-INF/*.SF</exclude>
					                <exclude>META-INF/*.DSA</exclude>
					                <exclude>META-INF/*.RSA</exclude>
					              </excludes>
					            </filter>
				          	</filters>
						</configuration>
					</execution>
				</executions>
			</plugin>

			<plugin>
				<groupId>org.apache.maven.plugins</groupId>
				<artifactId>maven-compiler-plugin</artifactId>
				<version>3.3</version>
				<configuration>
					<source>1.6</source>
					<target>1.6</target>
				</configuration>
			</plugin>
		</plugins>
	</build>
</project>