The issue is that you don't have the Hadoop classes on your compiler classpath. In the first example you pick them up from the Spark assembly jar, which bundles Spark, Hadoop, and all of their dependencies together. In the second example you are referencing the Spark jars as they are deployed in a Hadoop cluster; those jars no longer contain a copy of the Hadoop classes, so you also need to add the cluster's Hadoop jars to your classpath (first sketch below). It may be much easier to manage this as a project with SBT or Maven and let the build tool sort out the dependencies (second sketch below).
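For example, here is a minimal, untested sketch of the CDH compile command. It reuses the jars from your command below; the Hadoop jar location is an assumption (I'm guessing the parcel exposes the client jars under /opt/cloudera/parcels/CDH/lib/hadoop/client), so adjust the glob to your layout, or substitute the output of `hadoop classpath` instead:

  # Collect the Hadoop client jars into a colon-separated classpath.
  # (The hadoop/client directory is an assumption; check your parcel layout.)
  HADOOP_CP=$(echo /opt/cloudera/parcels/CDH/lib/hadoop/client/*.jar | tr ' ' ':')

  scalac -classpath "/apps/software/secondstring/secondstring/dist/lib/secondstring-20140723.jar:/opt/cloudera/parcels/CDH/lib/spark/core/lib/spark-core_2.10-1.0.0-cdh5.1.0.jar:/opt/cloudera/parcels/CDH/lib/spark/lib/kryo-2.21.jar:/opt/cloudera/parcels/CDH/lib/hadoop/lib/commons-io-2.4.jar:$HADOOP_CP" JaccardScore.scala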
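If you go the SBT route instead, a minimal build.sbt sketch might look like the following; the Scala version, the CDH artifact versions, and the Cloudera repository URL are assumptions inferred from the jar names in your command, so verify them against your cluster:

  scalaVersion := "2.10.4"

  // The Cloudera repo hosts the CDH-versioned artifacts (URL is an assumption).
  resolvers += "cloudera-repos" at "https://repository.cloudera.com/artifactory/cloudera-repos/"

  libraryDependencies ++= Seq(
    // "provided": compile against these, but don't bundle them;
    // the cluster supplies them at runtime.
    "org.apache.spark" %% "spark-core" % "1.0.0-cdh5.1.0" % "provided",
    "org.apache.hadoop" % "hadoop-client" % "2.3.0-cdh5.1.0" % "provided"
  )

Then `sbt compile` replaces the hand-maintained scalac classpath, and `sbt package` builds the jar for your job.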
On Wed, Jul 23, 2014 at 6:01 PM, Sameer Tilak <ssti...@live.com> wrote:

> Hi everyone,
>
> I was using Spark 1.0 from the Apache site and I was able to compile my
> code successfully using:
>
> scalac -classpath
> /apps/software/secondstring/secondstring/dist/lib/secondstring-20140630.jar:/apps/software/spark-1.0.0-bin-hadoop1/lib/datanucleus-api-jdo-3.2.1.jar:/apps/software/spark-1.0.0-bin-hadoop1/lib/spark-assembly-1.0.0-hadoop1.0.4.jar:spark-assembly-1.0.0-hadoop1.0.4.jar/datanucleus-core-3.2.2.jar
> ComputeScores.scala
>
> Last week I moved to CDH 5.1 and I am trying to compile the same code as
> follows. However, I am getting the errors below. Any help with this will
> be great!
>
> scalac -classpath
> /apps/software/secondstring/secondstring/dist/lib/secondstring-20140723.jar:/opt/cloudera/parcels/CDH/lib/spark/core/lib/spark-core_2.10-1.0.0-cdh5.1.0.jar:/opt/cloudera/parcels/CDH/lib/spark/lib/kryo-2.21.jar:/opt/cloudera/parcels/CDH/lib/hadoop/lib/commons-io-2.4.jar
> JaccardScore.scala
>
> JaccardScore.scala:37: error: bad symbolic reference. A signature in
> SparkContext.class refers to term io in package org.apache.hadoop which is
> not available. It may be completely missing from the current classpath, or
> the version on the classpath might be incompatible with the version used
> when compiling SparkContext.class.
>     val mjc = new Jaccard() with Serializable
>     ^
>
> JaccardScore.scala:39: error: bad symbolic reference. A signature in
> SparkContext.class refers to term io in package org.apache.hadoop which is
> not available. It may be completely missing from the current classpath, or
> the version on the classpath might be incompatible with the version used
> when compiling SparkContext.class.
>     val conf = new SparkConf().setMaster("spark://pzxnvm2021:7077").setAppName("ApproxStrMatch")
>     ^
>
> JaccardScore.scala:51: error: bad symbolic reference. A signature in
> SparkContext.class refers to term io in package org.apache.hadoop which is
> not available. It may be completely missing from the current classpath, or
> the version on the classpath might be incompatible with the version used
> when compiling SparkContext.class.
>     var scorevector = destrdd.map(x => jc_.score(str1, new BasicStringWrapper(x)))