Hi Sean,

Thanks for the quick reply. I moved to an sbt-based build and was able to build the project successfully. In /apps/sameert/software/approxstrmatch I see the following:

jar -tf target/scala-2.10/approxstrmatch_2.10-1.0.jar
META-INF/MANIFEST.MF
approxstrmatch/
approxstrmatch/MyRegistrator.class
approxstrmatch/JaccardScore$$anonfun$calculateJaccardScore$1.class
approxstrmatch/JaccardScore$$anonfun$calculateAnotatedJaccardScore$1.class
approxstrmatch/JaccardScore$$anonfun$calculateSortedJaccardScore$1$$anonfun$4.class
approxstrmatch/JaccardScore$$anon$1.class
approxstrmatch/JaccardScore$$anonfun$calculateSortedJaccardScore$1$$anonfun$3.class
approxstrmatch/JaccardScore$$anonfun$calculateSortedJaccardScore$1.class
approxstrmatch/JaccardScore$$anonfun$calculateAnotatedJaccardScore$1$$anonfun$2.class
approxstrmatch/JaccardScore.class
approxstrmatch/JaccardScore$$anonfun$calculateSortedJaccardScore$1$$anonfun$5.class
approxstrmatch/JaccardScore$$anonfun$calculateJaccardScore$1$$anonfun$1.class
However, when I start my spark shell:

spark-shell --jars /apps/sameert/software/secondstring/secondstring/dist/lib/secondstring-20140723.jar /apps/sameert/software/approxstrmatch/target/scala-2.10/approxstrmatch_2.10-1.0.jar

and then type the following interactively, I get an error. Not sure what I am missing now; this used to work before.

val srcFile = sc.textFile("hdfs://ipaddr:8020/user/sameert/approxstrmatch/target-sentences.csv")
val distFile = sc.textFile("hdfs://ipaddr:8020/user/sameert/approxstrmatch/sameer_sentence_filter.tsv")
val score = new approxstrmatch.JaccardScore()

error: not found: value approxstrmatch

> From: so...@cloudera.com
> Date: Wed, 23 Jul 2014 18:11:34 +0100
> Subject: Re: error: bad symbolic reference. A signature in SparkContext.class refers to term io in package org.apache.hadoop which is not available
> To: user@spark.apache.org
>
> The issue is that you don't have Hadoop classes in your compiler
> classpath. In the first example, you are getting Hadoop classes from
> the Spark assembly, which packages everything together.
>
> In the second example, you are referencing the Spark .jars as deployed
> in a Hadoop cluster. They no longer contain a copy of the Hadoop
> classes, so you would also need to add the Hadoop .jars in the cluster
> to your classpath.
>
> It may be much easier to manage this as a project with SBT or Maven
> and let it sort out dependencies.
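One detail worth checking in the spark-shell invocation above (an observation based on Spark's documented command-line behavior, not something confirmed later in this thread): --jars expects a single comma-separated list, so a second jar separated by a space is treated as an application argument rather than added to the classpath, which would produce exactly a "not found: value approxstrmatch" error in the shell. A sketch of the comma-separated form, using the same paths as above:

```shell
spark-shell --jars /apps/sameert/software/secondstring/secondstring/dist/lib/secondstring-20140723.jar,/apps/sameert/software/approxstrmatch/target/scala-2.10/approxstrmatch_2.10-1.0.jar
```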
> On Wed, Jul 23, 2014 at 6:01 PM, Sameer Tilak <ssti...@live.com> wrote:
> > Hi everyone,
> >
> > I was using Spark 1.0 from the Apache site and I was able to compile my
> > code successfully using:
> >
> > scalac -classpath /apps/software/secondstring/secondstring/dist/lib/secondstring-20140630.jar:/apps/software/spark-1.0.0-bin-hadoop1/lib/datanucleus-api-jdo-3.2.1.jar:/apps/software/spark-1.0.0-bin-hadoop1/lib/spark-assembly-1.0.0-hadoop1.0.4.jar:spark-assembly-1.0.0-hadoop1.0.4.jar/datanucleus-core-3.2.2.jar ComputeScores.scala
> >
> > Last week I moved to CDH 5.1 and I am trying to compile the same code by
> > doing the following; however, I am getting the errors below. Any help
> > with this will be great!
> >
> > scalac -classpath /apps/software/secondstring/secondstring/dist/lib/secondstring-20140723.jar:/opt/cloudera/parcels/CDH/lib/spark/core/lib/spark-core_2.10-1.0.0-cdh5.1.0.jar:/opt/cloudera/parcels/CDH/lib/spark/lib/kryo-2.21.jar:/opt/cloudera/parcels/CDH/lib/hadoop/lib/commons-io-2.4.jar JaccardScore.scala
> >
> > JaccardScore.scala:37: error: bad symbolic reference. A signature in
> > SparkContext.class refers to term io in package org.apache.hadoop which
> > is not available. It may be completely missing from the current
> > classpath, or the version on the classpath might be incompatible with
> > the version used when compiling SparkContext.class.
> >   val mjc = new Jaccard() with Serializable
> >             ^
> > JaccardScore.scala:39: error: bad symbolic reference. A signature in
> > SparkContext.class refers to term io in package org.apache.hadoop which
> > is not available. It may be completely missing from the current
> > classpath, or the version on the classpath might be incompatible with
> > the version used when compiling SparkContext.class.
> >   val conf = new SparkConf().setMaster("spark://pzxnvm2021:7077").setAppName("ApproxStrMatch")
> >              ^
> > JaccardScore.scala:51: error: bad symbolic reference. A signature in
> > SparkContext.class refers to term io in package org.apache.hadoop which
> > is not available. It may be completely missing from the current
> > classpath, or the version on the classpath might be incompatible with
> > the version used when compiling SparkContext.class.
> >   var scorevector = destrdd.map(x => jc_.score(str1, new BasicStringWrapper(x)))
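For reference, the sbt route suggested above can be sketched roughly as the build.sbt below. This is an illustrative config fragment, not a verified build: the artifact versions are assumptions matching the Spark 1.0 / Scala 2.10 / CDH 5.1 setup described in this thread, and the project name is taken from the jar listing. The key point is marking Spark and Hadoop as "provided" so sbt resolves them (and their transitive org.apache.hadoop.io classes) at compile time while keeping them out of the application jar, since the cluster supplies them at runtime:

```scala
// build.sbt -- a sketch; versions are assumptions, adjust to your cluster
name := "approxstrmatch"

version := "1.0"

scalaVersion := "2.10.4"

// Spark and Hadoop classes are needed to compile against SparkContext,
// but the cluster provides them at runtime, hence "provided".
libraryDependencies ++= Seq(
  "org.apache.spark"  %% "spark-core"    % "1.0.0" % "provided",
  "org.apache.hadoop" %  "hadoop-client" % "2.3.0" % "provided"
)
```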