RE: Adding Spark Cassandra dependency breaks Spark Streaming?

2014-12-06 Thread Ashic Mahtab
Update:
It seems the following combo causes things in spark streaming to go missing:

spark-core 1.1.0
spark-streaming 1.1.0
spark-cassandra-connector 1.1.0

The moment I add the three together, things like StreamingContext and Seconds
are unavailable. sbt assembly fails saying those aren't there. sbt clean /
deleting .ivy2 and .m2 doesn't resolve the issue.
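
One way to dig into this (a sketch on my part, assuming the missing classes come
from a transitive version clash introduced by the connector) is to ask sbt what
it actually resolved and to pin the contested modules:

// `sbt evicted` (available from sbt 0.13.6) lists which module revisions Ivy
// evicted during resolution; dependencyOverrides forces the pinned versions to win.
dependencyOverrides ++= Set(
  "org.apache.spark" %% "spark-core"      % "1.1.0",
  "org.apache.spark" %% "spark-streaming" % "1.1.0"
)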
I've also set up a 1.1.1 spark cluster and created a jar with the following
dependencies:

spark-core 1.1.1
spark-streaming 1.1.1
spark-sql 1.1.1
spark-cassandra-connector 1.1.0

Everything runs perfectly.
I'll be upgrading my clusters to 1.1.1 anyway, but I am intrigued...I'm fairly 
new to sbt, scala and the jvm in general. Any idea how having spark streaming 
1.1.0 and spark cassandra connector 1.1.0 together would cause classes in spark 
streaming to go missing?
Here's the full sbt file if anybody is interested:
import sbt._
import Keys._



name := "untitled19"

version := "1.0"

scalaVersion := "2.10.4"

val sparkCore = "org.apache.spark" %% "spark-core" % "1.1.0" % "provided"
val sparkStreaming = "org.apache.spark" %% "spark-streaming" % "1.1.0" % "provided"
val sparkSql = "org.apache.spark" %% "spark-sql" % "1.1.0" % "provided"
val sparkCassandra = "com.datastax.spark" %% "spark-cassandra-connector" % "1.1.0" withSources() withJavadoc()

libraryDependencies ++= Seq(
  sparkCore,
  sparkSql,
  sparkStreaming,
  sparkCassandra
)

resolvers += "Akka Repository" at "http://repo.akka.io/releases/"

assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) =>
    (xs map (_.toLowerCase)) match {
      case ("manifest.mf" :: Nil) | ("index.list" :: Nil) | ("dependencies" :: Nil) =>
        MergeStrategy.discard
      case _ => MergeStrategy.discard
    }
  case _ => MergeStrategy.first
}
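
An aside on the merge strategy above (an assumption of mine, not an issue raised
in this thread): the catch-all MergeStrategy.first keeps only one copy of
reference.conf, and Akka, which Spark pulls in, reads the merged reference.conf
from every jar, so concatenating those files is usually safer:

assemblyMergeStrategy in assembly := {
  // Concatenate typesafe-config defaults so no jar's reference.conf is dropped.
  case "reference.conf" => MergeStrategy.concat
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case _ => MergeStrategy.first
}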
Regards,
Ashic.


RE: Adding Spark Cassandra dependency breaks Spark Streaming?

2014-12-06 Thread Ashic Mahtab
Hi,
Just checked: cassandra connector 1.1.0-beta1 runs fine. The issue seems to be
the combination of spark streaming 1.1.0 and cassandra connector 1.1.0 (final).
Regards,
Ashic.

Date: Sat, 6 Dec 2014 13:52:20 -0500
Subject: Re: Adding Spark Cassandra dependency breaks Spark Streaming?
From: jayunit100.apa...@gmail.com
To: as...@live.com

This is working for me as a dependency set for a spark streaming app w/ cassandra.

https://github.com/jayunit100/SparkBlueprint/blob/master/build.sbt

libraryDependencies += "com.datastax.spark" %% "spark-cassandra-connector" % "1.1.0-beta1" withSources() withJavadoc()

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.1.0"

libraryDependencies += "org.scalatest" % "scalatest_2.10.0-M4" % "1.9-2.10.0-M4-B1"

libraryDependencies += "junit" % "junit" % "4.8.1" % "test"

libraryDependencies += "org.apache.spark" %% "spark-mllib" % "1.1.0"

libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.1.0"

libraryDependencies += "org.apache.spark" %% "spark-streaming" % "1.1.0"
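
If the final 1.1.0 connector has to stay, another option (a sketch, assuming the
clash comes from the connector's transitive spark artifacts; the actual artifact
names should be verified with `sbt evicted` first) is to exclude those
transitives so only the explicitly declared versions resolve:

val sparkCassandra = ("com.datastax.spark" %% "spark-cassandra-connector" % "1.1.0")
  .exclude("org.apache.spark", "spark-streaming_2.10") // hypothetical artifact name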
  
  




Adding Spark Cassandra dependency breaks Spark Streaming?

2014-12-05 Thread Ashic Mahtab
Hi,
Seems adding the cassandra connector and spark streaming causes issues. I've
added my build and code file. Running sbt compile gives weird errors like
Seconds is not part of org.apache.spark.streaming and object Receiver is not a
member of package org.apache.spark.streaming.receiver. If I take out
cassandraConnector from the list of dependencies, sbt compile succeeds.
How is adding the dependency removing things from spark streaming packages? Is 
there something I can do (perhaps in sbt) to not have this break?

Here's my build file:

import sbt.Keys._
import sbt._

name := "untitled99"

version := "1.0"

scalaVersion := "2.10.4"

val spark = "org.apache.spark" %% "spark-core" % "1.1.0"
val sparkStreaming = "org.apache.spark" %% "spark-streaming" % "1.1.0"
val cassandraConnector = "com.datastax.spark" %% "spark-cassandra-connector" % "1.1.0" withSources() withJavadoc()

libraryDependencies ++= Seq(cassandraConnector, spark, sparkStreaming)

resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
And here's my code:
import org.apache.spark.SparkContextimport 
org.apache.spark.storage.StorageLevelimport 
org.apache.spark.streaming.{Seconds, StreamingContext}import 
org.apache.spark.streaming.receiver.Receiverobject Foo {def main(args: 
Array[String]) {val context = new SparkContext()val ssc = new 
StreamingContext(context, Seconds(2))}}class Bar extends Receiver[Int]{override 
def onStart(): Unit = ???override def onStop(): Unit = ???}
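
For completeness, a minimal sketch (not part of the original mail) of wiring
such a receiver into the streaming context, assuming onStart eventually pushes
data in via store():

val stream = ssc.receiverStream(new Bar) // DStream[Int] fed by Bar's store() calls
stream.print()
ssc.start()
ssc.awaitTermination()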
  

Re: Adding Spark Cassandra dependency breaks Spark Streaming?

2014-12-05 Thread Ted Yu
Can you try with maven ?

diff --git a/streaming/pom.xml b/streaming/pom.xml
index b8b8f2e..6cc8102 100644
--- a/streaming/pom.xml
+++ b/streaming/pom.xml
@@ -68,6 +68,11 @@
       <artifactId>junit-interface</artifactId>
       <scope>test</scope>
     </dependency>
+    <dependency>
+      <groupId>com.datastax.spark</groupId>
+      <artifactId>spark-cassandra-connector_2.10</artifactId>
+      <version>1.1.0</version>
+    </dependency>
   </dependencies>
   <build>
     <outputDirectory>target/scala-${scala.binary.version}/classes</outputDirectory>

You can use the following command:
mvn -pl core,streaming package -DskipTests
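
To see which spark-streaming jar actually wins on the classpath, Maven's
built-in report may also help (a suggestion beyond the original mail):
mvn dependency:tree -Dincludes=org.apache.spark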

Cheers


RE: Adding Spark Cassandra dependency breaks Spark Streaming?

2014-12-05 Thread Ashic Mahtab
Sorry... really don't have enough maven know-how to do this quickly. I tried the
pom below, and IntelliJ could find org.apache.spark.streaming.StreamingContext
and org.apache.spark.streaming.Seconds, but not
org.apache.spark.streaming.receiver.Receiver. Is there something specific I can
try? I'll try sbt on the home machine in a couple of hours.

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>untitled100</groupId>
    <artifactId>untiled100</artifactId>
    <version>1.0-SNAPSHOT</version>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>1.1.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_2.10</artifactId>
            <version>1.1.0</version>
        </dependency>
        <dependency>
            <groupId>com.datastax.spark</groupId>
            <artifactId>spark-cassandra-connector_2.10</artifactId>
            <version>1.1.0</version>
        </dependency>
    </dependencies>
</project>





  

RE: Adding Spark Cassandra dependency breaks Spark Streaming?

2014-12-05 Thread Ashic Mahtab
Getting this on the home machine as well. Without the spark cassandra connector
in libraryDependencies, it compiles.
I've recently updated IntelliJ to 14. Could that be causing an issue?
