Is there any way that I can install the new one and remove the previous version? I installed Spark 1.3 on my EC2 master and set the Spark home to the new one.
But when I start the spark-shell I get -

java.lang.UnsatisfiedLinkError: org.apache.hadoop.security.JniBasedUnixGroupsMapping.anchorNative()V
        at org.apache.hadoop.security.JniBasedUnixGroupsMapping.anchorNative(Native Method)

Is there no way to upgrade without creating a new cluster?

Thanks
Roni
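That UnsatisfiedLinkError typically appears when the Hadoop classes on Spark's classpath do not match the native libhadoop installed on the machine. A minimal diagnostic sketch (VersionInfo is a standard Hadoop utility class; running this via spark-shell or spark-submit on the master is an assumption about the setup):

    // Diagnostic sketch: report which Hadoop version is on the classpath and
    // where the Hadoop security classes are loaded from, to spot a stale jar.
    import org.apache.hadoop.util.VersionInfo

    object HadoopVersionCheck {
      def main(args: Array[String]): Unit = {
        println(s"Hadoop version on classpath: ${VersionInfo.getVersion}")
        val src = classOf[org.apache.hadoop.security.UserGroupInformation]
          .getProtectionDomain.getCodeSource.getLocation
        println(s"Hadoop classes loaded from: $src")
      }
    }

If the printed version disagrees with the Hadoop build the AMI ships (for example a CDH build versus Apache 2.6.0), the anchorNative link failure is the expected symptom, and a Spark 1.3 package built against the cluster's Hadoop version is the usual remedy.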
On Wed, Mar 25, 2015 at 1:18 PM, Dean Wampler <deanwamp...@gmail.com> wrote:

> Yes, that's the problem. The RDD class exists in both binary jar files,
> but the signatures probably don't match. The bottom line, as always for
> tools like this, is that you can't mix versions.
>
> Dean Wampler, Ph.D.
> Author: Programming Scala, 2nd Edition
> <http://shop.oreilly.com/product/0636920033073.do> (O'Reilly)
> Typesafe <http://typesafe.com>
> @deanwampler <http://twitter.com/deanwampler>
> http://polyglotprogramming.com
>
> On Wed, Mar 25, 2015 at 3:13 PM, roni <roni.epi...@gmail.com> wrote:
>
>> My cluster is still on Spark 1.2 and in SBT I am using 1.3.
>> So probably it is compiling with 1.3 but running with 1.2?
>>
>> On Wed, Mar 25, 2015 at 12:34 PM, Dean Wampler <deanwamp...@gmail.com> wrote:
>>
>>> Weird. Are you running using the SBT console? It should have the spark-core
>>> jar on the classpath. Similarly, spark-shell or spark-submit should work,
>>> but be sure you're using the same version of Spark when running as when
>>> compiling. Also, you might need to add spark-sql to your SBT dependencies,
>>> but that shouldn't be this issue.
>>>
>>> On Wed, Mar 25, 2015 at 12:09 PM, roni <roni.epi...@gmail.com> wrote:
>>>
>>>> Thanks Dean and Nick.
>>>> So, I removed ADAM and H2O from my SBT as I was not using them.
>>>> I got the code to compile - only for it to fail while running with:
>>>>
>>>> SparkContext: Created broadcast 1 from textFile at kmerIntersetion.scala:21
>>>> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/rdd/RDD$
>>>>         at preDefKmerIntersection$.main(kmerIntersetion.scala:26)
>>>>
>>>> This line is where I do a "JOIN" operation:
>>>>
>>>> val hgPair = hgfasta.map(_.split(",")).map(a => (a(0), a(1).trim().toInt))
>>>> val filtered = hgPair.filter(kv => kv._2 == 1)
>>>> val bedPair = bedFile.map(_.split(",")).map(a => (a(0), a(1).trim().toInt))
>>>> *val joinRDD = bedPair.join(filtered)*
>>>>
>>>> Any idea what's going on?
>>>> I have data on the EC2, so I am avoiding creating a new cluster and am
>>>> instead just upgrading and changing the code to use 1.3 and Spark SQL.
>>>> Thanks
>>>> Roni
>>>>
>>>> On Wed, Mar 25, 2015 at 9:50 AM, Dean Wampler <deanwamp...@gmail.com> wrote:
>>>>
>>>>> For the Spark SQL parts, 1.3 breaks backwards compatibility, because
>>>>> before 1.3, Spark SQL was considered experimental, where API changes were
>>>>> allowed.
>>>>>
>>>>> So, H2O and ADAM compatible with 1.2.x might not work with 1.3.
>>>>>
>>>>> dean
>>>>>
>>>>> On Wed, Mar 25, 2015 at 9:39 AM, roni <roni.epi...@gmail.com> wrote:
>>>>>
>>>>>> Even if H2O and ADAM are dependent on 1.2.1, it should be backward
>>>>>> compatible, right?
>>>>>> So using 1.3 should not break them.
>>>>>> And the code is not using the classes from those libs.
>>>>>> I tried sbt clean compile .. same error
>>>>>> Thanks
>>>>>> _R
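One way to rule out the compile-with-1.3-but-run-with-1.2 mismatch diagnosed above is to compile against exactly the Spark version the cluster runs and mark it "provided", so the cluster's jar is the only copy on the runtime classpath. A minimal build.sbt sketch, assuming the cluster stays on 1.2.1 until it can be rebuilt (the version string is illustrative):

    // build.sbt sketch: compile against the same Spark the cluster runs, and
    // mark it "provided" so the application jar does not bundle a second,
    // conflicting copy of spark-core.
    libraryDependencies += "org.apache.spark" %% "spark-core" % "1.2.1" % "provided"

With provided scope, spark-submit supplies Spark at runtime; a plain `sbt run` will then fail locally unless Spark is added back on a separate test classpath.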
>>>>>>
>>>>>> On Wed, Mar 25, 2015 at 9:26 AM, Nick Pentreath <nick.pentre...@gmail.com> wrote:
>>>>>>
>>>>>>> What version of Spark do the other dependencies rely on (ADAM and
>>>>>>> H2O)? That could be it.
>>>>>>>
>>>>>>> Or try sbt clean compile.
>>>>>>>
>>>>>>> On Wed, Mar 25, 2015 at 5:58 PM, roni <roni.epi...@gmail.com> wrote:
>>>>>>>
>>>>>>>> I have an EC2 cluster created using Spark version 1.2.1,
>>>>>>>> and I have an SBT project.
>>>>>>>> Now I want to upgrade to Spark 1.3 and use the new features.
>>>>>>>> Below are the issues.
>>>>>>>> Sorry for the long post.
>>>>>>>> Appreciate your help.
>>>>>>>> Thanks
>>>>>>>> -Roni
>>>>>>>>
>>>>>>>> Question - Do I have to create a new cluster using Spark 1.3?
>>>>>>>>
>>>>>>>> Here is what I did -
>>>>>>>>
>>>>>>>> In my SBT file I changed to -
>>>>>>>> libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.0"
>>>>>>>>
>>>>>>>> But then I started getting a compilation error, along with eviction
>>>>>>>> warnings. Here are some of the libraries that were evicted:
>>>>>>>>
>>>>>>>> [warn] * org.apache.spark:spark-core_2.10:1.2.0 -> 1.3.0
>>>>>>>> [warn] * org.apache.hadoop:hadoop-client:(2.5.0-cdh5.2.0, 2.2.0) -> 2.6.0
>>>>>>>> [warn] Run 'evicted' to see detailed eviction warnings
>>>>>>>>
>>>>>>>> constructor cannot be instantiated to expected type;
>>>>>>>> [error] found   : (T1, T2)
>>>>>>>> [error] required: org.apache.spark.sql.catalyst.expressions.Row
>>>>>>>> [error] val ty = joinRDD.map{case(word, (file1Counts, file2Counts)) => KmerIntesect(word, file1Counts,"xyz")}
>>>>>>>> [error]                          ^
>>>>>>>>
>>>>>>>> Here is my SBT and code --
>>>>>>>>
>>>>>>>> SBT -
>>>>>>>>
>>>>>>>> version := "1.0"
>>>>>>>>
>>>>>>>> scalaVersion := "2.10.4"
>>>>>>>>
>>>>>>>> resolvers += "Sonatype OSS Snapshots" at "https://oss.sonatype.org/content/repositories/snapshots"
>>>>>>>> resolvers += "Maven Repo1" at "https://repo1.maven.org/maven2"
>>>>>>>> resolvers += "Maven Repo" at "https://s3.amazonaws.com/h2o-release/h2o-dev/master/1056/maven/repo/"
>>>>>>>>
>>>>>>>> /* Dependencies - %% appends Scala version to artifactId */
>>>>>>>> libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.6.0"
>>>>>>>> libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.0"
>>>>>>>> libraryDependencies += "org.bdgenomics.adam" % "adam-core" % "0.16.0"
>>>>>>>> libraryDependencies += "ai.h2o" % "sparkling-water-core_2.10" % "0.2.10"
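The eviction warnings in the message above come from adam-core and sparkling-water dragging in their own spark-core 1.2.x and a CDH-flavored hadoop-client. If those libraries were still needed, one option is to exclude their transitive Spark artifact - a sketch only; whether these two exclusions suffice depends on what else each library pulls in:

    // build.sbt sketch: keep ADAM and Sparkling Water but drop their transitive
    // spark-core so only the explicit 1.3.0 pin remains on the classpath.
    libraryDependencies += "org.bdgenomics.adam" % "adam-core" % "0.16.0" exclude("org.apache.spark", "spark-core_2.10")
    libraryDependencies += "ai.h2o" % "sparkling-water-core_2.10" % "0.2.10" exclude("org.apache.spark", "spark-core_2.10")

Note that an exclusion only removes the duplicate jar; it does not make a library compiled against 1.2 binary-compatible with 1.3, which is Dean's point earlier in the thread.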
>>>>>>>>
>>>>>>>> CODE --
>>>>>>>>
>>>>>>>> import org.apache.spark.{SparkConf, SparkContext}
>>>>>>>>
>>>>>>>> case class KmerIntesect(kmer: String, kCount: Int, fileName: String)
>>>>>>>>
>>>>>>>> object preDefKmerIntersection {
>>>>>>>>   def main(args: Array[String]) {
>>>>>>>>     val sparkConf = new SparkConf().setAppName("preDefKmer-intersect")
>>>>>>>>     val sc = new SparkContext(sparkConf)
>>>>>>>>     val sqlContext = new org.apache.spark.sql.SQLContext(sc)
>>>>>>>>     import sqlContext.createSchemaRDD // must come after sqlContext is defined
>>>>>>>>     val bedFile = sc.textFile("s3n://a/b/c", 40)
>>>>>>>>     val hgfasta = sc.textFile("hdfs://a/b/c", 40)
>>>>>>>>     val hgPair = hgfasta.map(_.split(",")).map(a => (a(0), a(1).trim().toInt))
>>>>>>>>     val filtered = hgPair.filter(kv => kv._2 == 1)
>>>>>>>>     val bedPair = bedFile.map(_.split(",")).map(a => (a(0), a(1).trim().toInt))
>>>>>>>>     val joinRDD = bedPair.join(filtered)
>>>>>>>>     val ty = joinRDD.map { case (word, (file1Counts, file2Counts)) => KmerIntesect(word, file1Counts, "xyz") }
>>>>>>>>     ty.registerTempTable("KmerIntesect")
>>>>>>>>     ty.saveAsParquetFile("hdfs://x/y/z/kmerIntersect.parquet")
>>>>>>>>   }
>>>>>>>> }
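For context on the "required: org.apache.spark.sql.catalyst.expressions.Row" compile error: in 1.3 the experimental SchemaRDD/createSchemaRDD path gave way to DataFrames, so the tail of the program needs a small port once spark-core is bumped. A hedged sketch of a 1.3-style equivalent (paths kept from the original; spark-sql 1.3.0 on the compile classpath is assumed):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    case class KmerIntesect(kmer: String, kCount: Int, fileName: String)

    object PreDefKmerIntersection13 {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("preDefKmer-intersect"))
        val sqlContext = new SQLContext(sc)
        import sqlContext.implicits._ // replaces createSchemaRDD in 1.3

        val bedFile = sc.textFile("s3n://a/b/c", 40)
        val hgfasta = sc.textFile("hdfs://a/b/c", 40)
        val hgPair = hgfasta.map(_.split(",")).map(a => (a(0), a(1).trim().toInt))
        val filtered = hgPair.filter(_._2 == 1)
        val bedPair = bedFile.map(_.split(",")).map(a => (a(0), a(1).trim().toInt))
        val joinRDD = bedPair.join(filtered)

        // toDF() turns the RDD of case classes into a DataFrame; registerTempTable
        // and saveAsParquetFile are DataFrame methods in 1.3, not SchemaRDD ones.
        val ty = joinRDD.map { case (word, (file1Counts, _)) => KmerIntesect(word, file1Counts, "xyz") }.toDF()
        ty.registerTempTable("KmerIntesect")
        ty.saveAsParquetFile("hdfs://x/y/z/kmerIntersect.parquet")
      }
    }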