Hi,
I have a Spark/Cassandra setup where I am using the Spark Cassandra Java
connector to query a table. So far, I have 1 Spark master node (2
cores) and 1 worker node (4 cores). Both of them have the following
spark-env.sh under conf/:
|#!/usr/bin/env bash
export SPARK_LOCAL_IP=127.0.0.1
export SPARK_MASTER_IP="192.168.4.134"
export SPARK_WORKER_MEMORY=1G
export SPARK_EXECUTOR_MEMORY=2G
|
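(As far as I understand, a value set on SparkConf in code takes precedence over SPARK_EXECUTOR_MEMORY from spark-env.sh, so the job below should end up requesting 200m per executor rather than 2G. A quick sketch to double-check what the driver will actually request; the "ConfProbe" app name and the printed fallbacks are just for this check:)
|// Standalone sanity check (not part of the job below): print the executor
// settings this driver would request from the master. Values set on
// SparkConf in code win over spark-env.sh environment variables.
SparkConf probe = new SparkConf();
probe.setMaster("spark://192.168.4.134:7077");
probe.setAppName("ConfProbe");
probe.set("spark.executor.memory", "200m");
probe.set("spark.cores.max", "1");
System.out.println(probe.get("spark.executor.memory", "<default>"));
System.out.println(probe.get("spark.cores.max", "<unset>"));|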
I am using Spark 1.4.1 along with Cassandra 2.2.0. I have started the
Cassandra/Spark setup, created a keyspace and a table in Cassandra, and
added some rows to the table. Now I try to run the following Spark job
using the Spark Cassandra Java connector:
|import org.apache.commons.lang3.StringUtils; // assuming commons-lang3's StringUtils
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import com.datastax.spark.connector.japi.CassandraJavaUtil;
import com.datastax.spark.connector.japi.CassandraRow;

SparkConf conf = new SparkConf();
conf.setAppName("Testing");
conf.setMaster("spark://192.168.4.134:7077");
conf.set("spark.cassandra.connection.host", "192.168.4.129");
conf.set("spark.logConf", "true");
conf.set("spark.driver.maxResultSize", "50m");
conf.set("spark.executor.memory", "200m");
conf.set("spark.eventLog.enabled", "true");
conf.set("spark.eventLog.dir", "/tmp/");
conf.set("spark.executor.extraClassPath", "/home/enlighted/ebd.jar");
conf.set("spark.cores.max", "1");

JavaSparkContext sc = new JavaSparkContext(conf);

// Read the whole testing.ec table and stringify each row.
JavaRDD<String> cassandraRowsRDD =
        CassandraJavaUtil.javaFunctions(sc).cassandraTable("testing", "ec")
                .map(new Function<CassandraRow, String>() {
                    private static final long serialVersionUID = -6263533266898869895L;

                    @Override
                    public String call(CassandraRow cassandraRow) throws Exception {
                        return cassandraRow.toString();
                    }
                });

// Pull everything back to the driver and print it (toArray() note below).
System.out.println("Data as CassandraRows: \n"
        + StringUtils.join(cassandraRowsRDD.toArray(), "\n"));
sc.close();|
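(Side note: JavaRDD.toArray() has been deprecated since Spark 1.0.0 in favor of collect(). The equivalent with collect(), which likewise pulls every row back to the driver and is therefore bounded by the 50m spark.driver.maxResultSize set above:)
|// Non-deprecated equivalent of the toArray() call above; collect() also
// materializes the whole RDD on the driver.
java.util.List<String> rows = cassandraRowsRDD.collect();
System.out.println("Data as CassandraRows: \n" + StringUtils.join(rows, "\n"));|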
This job gets stuck with an "insufficient resources" warning. Here are the logs:
|1107 [main] INFO org.apache.spark.SparkContext - Spark configuration:
spark.app.name=Testing
spark.cassandra.connection.host=192.168.4.129
spark.cores.max=1
spark.driver.maxResultSize=50m
spark.eventLog.dir=/tmp/
spark.eventLog.enabled=true
spark.executor.extraClassPath=/home/enlighted/ebd.jar
spark.executor.memory=200m
spark.logConf=true
spark.master=spark://192.168.4.134:7077
1121 [main] INFO org.apache.spark.SecurityManager - Changing view
acls to: enlighted
1122 [main] INFO org.apache.spark.SecurityManager - Changing modify
acls to: enlighted
1123 [main] INFO org.apache.spark.SecurityManager -
SecurityManager: authentication disabled; ui acls disabled; users
with view permissions: Set(enlighted); users with modify
permissions: Set(enlighted)
1767 [sparkDriver-akka.actor.default-dispatcher-4] INFO
akka.event.slf4j.Slf4jLogger - Slf4jLogger started
1805 [sparkDriver-akka.actor.default-dispatcher-4] INFO Remoting -
Starting remoting
1957 [main] INFO org.apache.spark.util.Utils - Successfully started
service 'sparkDriver' on port 54611.
1958 [sparkDriver-akka.actor.default-dispatcher-4] INFO Remoting -
Remoting started; listening on addresses
:[akka.tcp://sparkDriver@192.168.4.134:54611]
1977 [main] INFO org.apache.spark.SparkEnv - Registering
MapOutputTracker
1989 [main] INFO org.apache.spark.SparkEnv - Registering
BlockManagerMaster
2007 [main] INFO org.apache.spark.storage.DiskBlockManager -
Created local directory at
/tmp/spark-f21125fd-ae9d-460e-884d-563fa8720f09/blockmgr-3e3d54e7-16df-4e97-be48-b0c0fa0389e7
2012 [main] INFO org.apache.spark.storage.MemoryStore - MemoryStore
started with capacity 456.0 MB
2044 [main] INFO org.apache.spark.HttpFileServer - HTTP File server
directory is
/tmp/spark-f21125fd-ae9d-460e-884d-563fa8720f09/httpd-64b4d92e-cde9-45fb-8b38-edc3cca3933c
2046 [main] INFO org.apache.spark.HttpServer - Starting HTTP Server
2086 [main] INFO org.spark-project.jetty.server.Server -
jetty-8.y.z-SNAPSHOT
2098 [main] INFO org.spark-project.jetty.server.AbstractConnector -
Started SocketConnector@0.0.0.0:44884
2099 [main] INFO org.apache.spark.util.Utils - Successfully started
service 'HTTP file server' on port 44884.
2108 [main] INFO org.apache.spark.SparkEnv - Registering
OutputCommitCoordinator
2297 [main] INFO org.spark-project.jetty.server.Server -
jetty-8.y.z-SNAPSHOT
2317 [main] INFO org.spark-project.jetty.server.AbstractConnector -
Started SelectChannelConnector@0.0.0.0:4040
2318 [main] INFO org.apache.spark.util.Utils - Successfully started
service 'SparkUI' on port 4040.
2320 [main] INFO org.apache.spark.ui.SparkUI - Started SparkUI at
http://192.168.4.134:4040
2387 [sparkDriver-akka.actor.default-dispatcher-3] INFO
org.apache.spark.deploy.client.AppClient$ClientActor - Connecting
to master akka.tcp://sparkMaster@192.168.4.134:7077/user/Master...
2662 [sparkDriver-akka.actor.default-dispatcher-14] INFO
org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend -
Connected to Spark cluster with app ID app-20150806054450-0001
2680 [sparkDriver-akka.actor.default-dispatcher-14] INFO
org.apache.spark.deploy.client.AppClient$ClientActor - Executor
added: app-20150806054450-0001/0 on
worker-20150806053100-192.168.4.129-45566 (192.168.4.129:45566) with
1 cores
2682 [sparkDriver-akka.actor.default-dispatcher-14] INFO
org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend -
Granted executor ID app-20150806054450-0001/0 on hostPort
192.168.4.129:45566 with 1 cores, 200.0 MB RAM
2696 [main] INFO org.apache.spark.util.Utils - Successfully started
service 'org.apache.spark.network.netty.NettyBlockTransferService'
on port 49150.
2696 [main] INFO
org.apache.spark.network.netty.NettyBlockTransferService - Server
created on 49150
2700 [sparkDriver-akka.actor.default-dispatcher-14] INFO
org.apache.spark.deploy.client.AppClient$ClientActor - Executor
updated: app-20150806054450-0001/0 is now LOADING
2706 [main] INFO org.apache.spark.storage.BlockManagerMaster -
Trying to register BlockManager
2708 [sparkDriver-akka.actor.default-dispatcher-17] INFO
org.apache.spark.deploy.client.AppClient$ClientActor - Executor
updated: app-20150806054450-0001/0 is now RUNNING
2710 [sparkDriver-akka.actor.default-dispatcher-14] INFO
org.apache.spark.storage.BlockManagerMasterEndpoint - Registering
block manager 192.168.4.134:49150 with 456.0 MB RAM,
BlockManagerId(driver, 192.168.4.134, 49150)
2713 [main] INFO org.apache.spark.storage.BlockManagerMaster -
Registered BlockManager
2922 [main] INFO org.apache.spark.scheduler.EventLoggingListener -
Logging events to file:/tmp/app-20150806054450-0001
2939 [main] INFO
org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend -
SchedulerBackend is ready for scheduling beginning after reached
minRegisteredResourcesRatio: 0.0
3321 [main] INFO com.datastax.driver.core.Cluster - New Cassandra
host /192.168.4.129:9042 added
3321 [main] INFO com.datastax.driver.core.Cluster - New Cassandra
host /192.168.4.130:9042 added
3322 [main] INFO
com.datastax.spark.connector.cql.LocalNodeFirstLoadBalancingPolicy -
Added host 192.168.4.130 (DC1)
3322 [main] INFO com.datastax.driver.core.Cluster - New Cassandra
host /192.168.4.131:9042 added
3323 [main] INFO
com.datastax.spark.connector.cql.LocalNodeFirstLoadBalancingPolicy -
Added host 192.168.4.131 (DC1)
3323 [main] INFO com.datastax.driver.core.Cluster - New Cassandra
host /192.168.4.132:9042 added
3323 [main] INFO
com.datastax.spark.connector.cql.LocalNodeFirstLoadBalancingPolicy -
Added host 192.168.4.132 (DC1)
3325 [main] INFO
com.datastax.spark.connector.cql.CassandraConnector - Connected to
Cassandra cluster: enldbcluster
3881 [main] INFO org.apache.spark.SparkContext - Starting job:
toArray at Start.java:85
3898 [pool-18-thread-1] INFO
com.datastax.spark.connector.cql.CassandraConnector - Disconnected
from Cassandra cluster: enldbcluster
3901 [dag-scheduler-event-loop] INFO
org.apache.spark.scheduler.DAGScheduler - Got job 0 (toArray at
Start.java:85) with 6 output partitions (allowLocal=false)
3902 [dag-scheduler-event-loop] INFO
org.apache.spark.scheduler.DAGScheduler - Final stage: ResultStage
0(toArray at Start.java:85)
3902 [dag-scheduler-event-loop] INFO
org.apache.spark.scheduler.DAGScheduler - Parents of final stage:
List()
3908 [dag-scheduler-event-loop] INFO
org.apache.spark.scheduler.DAGScheduler - Missing parents: List()
3925 [dag-scheduler-event-loop] INFO
org.apache.spark.scheduler.DAGScheduler - Submitting ResultStage 0
(MapPartitionsRDD[1] at map at Start.java:77), which has no missing
parents
4002 [dag-scheduler-event-loop] INFO
org.apache.spark.storage.MemoryStore - ensureFreeSpace(7488) called
with curMem=0, maxMem=478182113
4004 [dag-scheduler-event-loop] INFO
org.apache.spark.storage.MemoryStore - Block broadcast_0 stored as
values in memory (estimated size 7.3 KB, free 456.0 MB)
4013 [dag-scheduler-event-loop] INFO
org.apache.spark.storage.MemoryStore - ensureFreeSpace(4015) called
with curMem=7488, maxMem=478182113
4013 [dag-scheduler-event-loop] INFO
org.apache.spark.storage.MemoryStore - Block broadcast_0_piece0
stored as bytes in memory (estimated size 3.9 KB, free 456.0 MB)
4015 [sparkDriver-akka.actor.default-dispatcher-14] INFO
org.apache.spark.storage.BlockManagerInfo - Added
broadcast_0_piece0 in memory on 192.168.4.134:49150 (size: 3.9 KB,
free: 456.0 MB)
4017 [dag-scheduler-event-loop] INFO org.apache.spark.SparkContext
- Created broadcast 0 from broadcast at DAGScheduler.scala:874
4089 [dag-scheduler-event-loop] INFO
com.datastax.driver.core.Cluster - New Cassandra host
/192.168.4.129:9042 added
4089 [dag-scheduler-event-loop] INFO
com.datastax.driver.core.Cluster - New Cassandra host
/192.168.4.130:9042 added
4089 [dag-scheduler-event-loop] INFO
com.datastax.driver.core.Cluster - New Cassandra host
/192.168.4.131:9042 added
4089 [dag-scheduler-event-loop] INFO
com.datastax.driver.core.Cluster - New Cassandra host
/192.168.4.132:9042 added
4089 [dag-scheduler-event-loop] INFO
com.datastax.spark.connector.cql.CassandraConnector - Connected to
Cassandra cluster: enldbcluster
4394 [pool-18-thread-1] INFO
com.datastax.spark.connector.cql.CassandraConnector - Disconnected
from Cassandra cluster: enldbcluster
4806 [dag-scheduler-event-loop] INFO
org.apache.spark.scheduler.DAGScheduler - Submitting 6 missing
tasks from ResultStage 0 (MapPartitionsRDD[1] at map at Start.java:77)
4807 [dag-scheduler-event-loop] INFO
org.apache.spark.scheduler.TaskSchedulerImpl - Adding task set 0.0
with 6 tasks
19822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl -
Initial job has not accepted any resources; check your cluster UI to
ensure that workers are registered and have sufficient resources
34822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl -
Initial job has not accepted any resources; check your cluster UI to
ensure that workers are registered and have sufficient resources
49822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl -
Initial job has not accepted any resources; check your cluster UI to
ensure that workers are registered and have sufficient resources
64822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl -
Initial job has not accepted any resources; check your cluster UI to
ensure that workers are registered and have sufficient resources
79822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl -
Initial job has not accepted any resources; check your cluster UI to
ensure that workers are registered and have sufficient resources
94822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl -
Initial job has not accepted any resources; check your cluster UI to
ensure that workers are registered and have sufficient resources
109822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl
- Initial job has not accepted any resources; check your cluster UI
to ensure that workers are registered and have sufficient resources
124822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl
- Initial job has not accepted any resources; check your cluster UI
to ensure that workers are registered and have sufficient resources
124963 [sparkDriver-akka.actor.default-dispatcher-14] INFO
org.apache.spark.deploy.client.AppClient$ClientActor - Executor
updated: app-20150806054450-0001/0 is now EXITED (Command exited
with code 1)
124964 [sparkDriver-akka.actor.default-dispatcher-14] INFO
org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend -
Executor app-20150806054450-0001/0 removed: Command exited with code 1
124968 [sparkDriver-akka.actor.default-dispatcher-17] ERROR
org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend -
Asked to remove non-existent executor 0
124969 [sparkDriver-akka.actor.default-dispatcher-14] INFO
org.apache.spark.deploy.client.AppClient$ClientActor - Executor
added: app-20150806054450-0001/1 on
worker-20150806053100-192.168.4.129-45566 (192.168.4.129:45566) with
1 cores
124969 [sparkDriver-akka.actor.default-dispatcher-14] INFO
org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend -
Granted executor ID app-20150806054450-0001/1 on hostPort
192.168.4.129:45566 with 1 cores, 200.0 MB RAM
124975 [sparkDriver-akka.actor.default-dispatcher-14] INFO
org.apache.spark.deploy.client.AppClient$ClientActor - Executor
updated: app-20150806054450-0001/1 is now RUNNING
125012 [sparkDriver-akka.actor.default-dispatcher-14] INFO
org.apache.spark.deploy.client.AppClient$ClientActor - Executor
updated: app-20150806054450-0001/1 is now LOADING
139822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl
- Initial job has not accepted any resources; check your cluster UI
to ensure that workers are registered and have sufficient resources
154822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl
- Initial job has not accepted any resources; check your cluster UI
to ensure that workers are registered and have sufficient resources
169823 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl
- Initial job has not accepted any resources; check your cluster UI
to ensure that workers are registered and have sufficient resources
184822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl
- Initial job has not accepted any resources; check your cluster UI
to ensure that workers are registered and have sufficient resources
199822 [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl
- Initial job has not accepted any resources; check your cluster UI
to ensure that workers are registered and have sufficient resources
|
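(One thought, since the executor keeps exiting with code 1: as far as I know, spark.executor.extraClassPath only adds /home/enlighted/ebd.jar to the executor classpath if that file already exists on the worker machine; it does not copy anything there. A sketch of the alternative I am considering, letting Spark ship the application jar itself, assuming the shaded jar sits at the same path on the driver machine:)
|// Instead of spark.executor.extraClassPath, have Spark distribute the
// application jar to executors; the path is resolved on the driver machine.
conf.setJars(new String[] { "/home/enlighted/ebd.jar" });|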
From the logs, the master does grant an executor (1 core, 200.0 MB RAM on the
worker at 192.168.4.129), but that executor exits with code 1 after about two
minutes, a replacement is launched on the same worker, and the scheduler keeps
repeating the insufficient-resources warning throughout.
Please find attached the Spark master UI screenshot; the pom.xml with
dependencies follows below. Can anyone please point out what the issue could be here?
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>enlighted</groupId>
<artifactId>ebd</artifactId>
<packaging>jar</packaging>
<version>1.0</version>
<name>ebd</name>
<properties>
<slf4j.version>1.7.12</slf4j.version>
<maven.jar.plugin.version>2.6</maven.jar.plugin.version>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<project.implementation.title>ebd</project.implementation.title>
<project.implementation.vendor>Enlighted Inc</project.implementation.vendor>
<org.springframework.version>4.2.0.RELEASE</org.springframework.version>
</properties>
<dependencies>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
<version>${slf4j.version}</version>
</dependency>
<dependency>
<groupId>org.springframework</groupId>
<artifactId>spring-core</artifactId>
<version>${org.springframework.version}</version>
</dependency>
<dependency>
<groupId>commons-io</groupId>
<artifactId>commons-io</artifactId>
<version>2.4</version>
</dependency>
<dependency>
<groupId>com.datastax.cassandra</groupId>
<artifactId>cassandra-driver-core</artifactId>
<version>2.1.7.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.4.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.10</artifactId>
<version>1.4.1</version>
</dependency>
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.10</artifactId>
<version>1.4.0-M2</version>
</dependency>
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector-java_2.10</artifactId>
<version>1.4.0-M2</version>
</dependency>
</dependencies>
<build>
<finalName>ebd</finalName>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<version>${maven.jar.plugin.version}</version>
<configuration>
<archive>
<manifest>
<mainClass>com.ebd.Start</mainClass>
</manifest>
<manifestSections>
<manifestSection>
<name>${project.name}</name>
<manifestEntries>
<Implementation-Title>${project.implementation.title}</Implementation-Title>
<Implementation-Vendor>${project.implementation.vendor}</Implementation-Vendor>
<Implementation-Version>${project.version}</Implementation-Version>
</manifestEntries>
</manifestSection>
</manifestSections>
</archive>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>2.4.1</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<transformers>
<transformer
implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
<resource>META-INF/spring.handlers</resource>
</transformer>
<transformer
implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
<resource>META-INF/spring.schemas</resource>
</transformer>
<transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
<resource>reference.conf</resource>
</transformer>
</transformers>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.3</version>
<configuration>
<source>1.6</source>
<target>1.6</target>
</configuration>
</plugin>
</plugins>
</build>
</project>