My cluster is still on Spark 1.2, while my SBT build uses 1.3. So it is probably compiling against 1.3 but running on 1.2?
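
One way to test that theory is to make the build match the cluster. Below is a minimal sketch of the dependency lines, assuming the cluster stays on 1.2.1 (the version given in the original post) and the job is launched with spark-submit so that the cluster's own Spark jars are used at run time. The `sparkVersion` name, the "provided" scope, and the spark-sql line are illustrative assumptions, not something confirmed in this thread:

```scala
// build.sbt (sketch): compile against the same Spark version the cluster runs.
// Assumption: the EC2 cluster is on 1.2.1; change this value if the cluster is upgraded.
val sparkVersion = "1.2.1"

// "provided" keeps Spark out of the packaged jar, so spark-submit supplies
// the cluster's Spark classes instead of a second, mismatched copy.
libraryDependencies += "org.apache.spark" %% "spark-core" % sparkVersion % "provided"

// Assumption: spark-sql is needed because the code uses SQLContext.
libraryDependencies += "org.apache.spark" %% "spark-sql" % sparkVersion % "provided"
```

Going the other way (keeping 1.3.0 in the build) would presumably require the cluster itself to run 1.3.0.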

On Wed, Mar 25, 2015 at 12:34 PM, Dean Wampler wrote:

> Weird. Are you running using SBT console? It should have the spark-core
> jar on the classpath. Similarly, spark-shell or spark-submit should work,
> but be sure you're using the same version of Spark when running as when
> compiling. Also, you might need to add spark-sql to your SBT dependencies,
> but that shouldn't be this issue.
>
> Dean Wampler, Ph.D.
> Author: Programming Scala, 2nd Edition
> <http://shop.oreilly.com/product/0636920033073.do> (O'Reilly)
> Typesafe <http://typesafe.com>
> @deanwampler <http://twitter.com/deanwampler>
> http://polyglotprogramming.com
>
> On Wed, Mar 25, 2015 at 12:09 PM, roni wrote:
>
>> Thanks Dean and Nick.
>> So, I removed ADAM and H2O from my SBT file, as I was not using them.
>> I got the code to compile, only to fail while running with:
>>
>> SparkContext: Created broadcast 1 from textFile at kmerIntersetion.scala:21
>> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/rdd/RDD$
>>     at preDefKmerIntersection$.main(kmerIntersetion.scala:26)
>>
>> This line is where I do a "JOIN" operation.
>>
>>     val hgPair = hgfasta.map(_.split (",")).map(a=> (a(0), a(1).trim().toInt))
>>     val filtered = hgPair.filter(kv => kv._2 == 1)
>>     val bedPair = bedFile.map(_.split (",")).map(a=> (a(0), a(1).trim().toInt))
>>     * val joinRDD = bedPair.join(filtered) *
>>
>> Any idea what's going on?
>> I have data on EC2, so I am avoiding creating a new cluster and am
>> just upgrading and changing the code to use 1.3 and Spark SQL.
>> Thanks
>> Roni
>>
>> On Wed, Mar 25, 2015 at 9:50 AM, Dean Wampler wrote:
>>
>>> For the Spark SQL parts, 1.3 breaks backwards compatibility, because
>>> before 1.3, Spark SQL was considered experimental where API changes were
>>> allowed.
>>>
>>> So, H2O and ADAM compatible with 1.2.X might not work with 1.3.
>>>
>>> dean
>>>
>>> Dean Wampler, Ph.D.
>>> Author: Programming Scala, 2nd Edition
>>> <http://shop.oreilly.com/product/0636920033073.do> (O'Reilly)
>>> Typesafe <http://typesafe.com>
>>> @deanwampler <http://twitter.com/deanwampler>
>>> http://polyglotprogramming.com
>>>
>>> On Wed, Mar 25, 2015 at 9:39 AM, roni wrote:
>>>
>>>> Even if H2O and ADAM depend on 1.2.1, it should be backward
>>>> compatible, right?
>>>> So using 1.3 should not break them.
>>>> And the code is not using the classes from those libs.
>>>> I tried sbt clean compile .. same error
>>>> Thanks
>>>> _R
>>>>
>>>> On Wed, Mar 25, 2015 at 9:26 AM, Nick Pentreath wrote:
>>>>
>>>>> What version of Spark do the other dependencies rely on (Adam and
>>>>> H2O?) - that could be it
>>>>>
>>>>> Or try sbt clean compile
>>>>>
>>>>> --
>>>>> Sent from Mailbox <https://www.dropbox.com/mailbox>
>>>>>
>>>>> On Wed, Mar 25, 2015 at 5:58 PM, roni wrote:
>>>>>
>>>>>> I have an EC2 cluster created using Spark version 1.2.1,
>>>>>> and I have an SBT project.
>>>>>> Now I want to upgrade to Spark 1.3 and use the new features.
>>>>>> Below are the issues.
>>>>>> Sorry for the long post.
>>>>>> Appreciate your help.
>>>>>> Thanks
>>>>>> -Roni
>>>>>>
>>>>>> Question - Do I have to create a new cluster using Spark 1.3?
>>>>>>
>>>>>> Here is what I did -
>>>>>>
>>>>>> In my SBT file I changed to -
>>>>>> libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.0"
>>>>>>
>>>>>> But then I started getting a compilation error, along with:
>>>>>>
>>>>>> Here are some of the libraries that were evicted:
>>>>>> [warn] * org.apache.spark:spark-core_2.10:1.2.0 -> 1.3.0
>>>>>> [warn] * org.apache.hadoop:hadoop-client:(2.5.0-cdh5.2.0, 2.2.0) -> 2.6.0
>>>>>> [warn] Run 'evicted' to see detailed eviction warnings
>>>>>>
>>>>>> constructor cannot be instantiated to expected type;
>>>>>> [error] found : (T1, T2)
>>>>>> [error] required: org.apache.spark.sql.catalyst.expressions.Row
>>>>>> [error] val ty = joinRDD.map{case(word, (file1Counts, file2Counts)) => KmerIntesect(word, file1Counts,"xyz")}
>>>>>> [error] ^
>>>>>>
>>>>>> Here is my SBT and code --
>>>>>> SBT -
>>>>>>
>>>>>> version := "1.0"
>>>>>>
>>>>>> scalaVersion := "2.10.4"
>>>>>>
>>>>>> resolvers += "Sonatype OSS Snapshots" at "https://oss.sonatype.org/content/repositories/snapshots"
>>>>>> resolvers += "Maven Repo1" at "https://repo1.maven.org/maven2"
>>>>>> resolvers += "Maven Repo" at "https://s3.amazonaws.com/h2o-release/h2o-dev/master/1056/maven/repo/"
>>>>>>
>>>>>> /* Dependencies - %% appends Scala version to artifactId */
>>>>>> libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.6.0"
>>>>>> libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.0"
>>>>>> libraryDependencies += "org.bdgenomics.adam" % "adam-core" % "0.16.0"
>>>>>> libraryDependencies += "ai.h2o" % "sparkling-water-core_2.10" % "0.2.10"
>>>>>>
>>>>>> CODE --
>>>>>> import org.apache.spark.{SparkConf, SparkContext}
>>>>>> case class KmerIntesect(kmer: String, kCount: Int, fileName: String)
>>>>>>
>>>>>> object preDefKmerIntersection {
>>>>>>   def main(args: Array[String]) {
>>>>>>
>>>>>>     val sparkConf = new SparkConf().setAppName("preDefKmer-intersect")
>>>>>>     val sc = new SparkContext(sparkConf)
>>>>>>     import sqlContext.createSchemaRDD
>>>>>>     val sqlContext = new org.apache.spark.sql.SQLContext(sc)
>>>>>>     val bedFile = sc.textFile("s3n://a/b/c",40)
>>>>>>     val hgfasta = sc.textFile("hdfs://a/b/c",40)
>>>>>>     val hgPair = hgfasta.map(_.split (",")).map(a=> (a(0), a(1).trim().toInt))
>>>>>>     val filtered = hgPair.filter(kv => kv._2 == 1)
>>>>>>     val bedPair = bedFile.map(_.split (",")).map(a=> (a(0), a(1).trim().toInt))
>>>>>>     val joinRDD = bedPair.join(filtered)
>>>>>>     val ty = joinRDD.map{case(word, (file1Counts, file2Counts)) => KmerIntesect(word, file1Counts,"xyz")}
>>>>>>     ty.registerTempTable("KmerIntesect")
>>>>>>
>>>>>>     ty.saveAsParquetFile("hdfs://x/y/z/kmerIntersect.parquet")
>>>>>>   }
>>>>>> }
