Re: error: bad symbolic reference. A signature in SparkContext.class refers to term io in package org.apache.hadoop which is not available
The issue is that you don't have the Hadoop classes in your compiler classpath. In the first example, you are getting the Hadoop classes from the Spark assembly, which packages everything together. In the second example, you are referencing the Spark .jars as deployed in a Hadoop cluster; they no longer contain a copy of the Hadoop classes, so you would also need to add the cluster's Hadoop .jars to your classpath. It may be much easier to manage this as a project with SBT or Maven and let the build tool sort out the dependencies.

On Wed, Jul 23, 2014 at 6:01 PM, Sameer Tilak ssti...@live.com wrote:

Hi everyone,

I was using Spark 1.0 from the Apache site and was able to compile my code successfully using:

    scalac -classpath /apps/software/secondstring/secondstring/dist/lib/secondstring-20140630.jar:/apps/software/spark-1.0.0-bin-hadoop1/lib/datanucleus-api-jdo-3.2.1.jar:/apps/software/spark-1.0.0-bin-hadoop1/lib/spark-assembly-1.0.0-hadoop1.0.4.jar:spark-assembly-1.0.0-hadoop1.0.4.jar/datanucleus-core-3.2.2.jar ComputeScores.scala

Last week I moved to CDH 5.1 and am trying to compile the same code as follows, but I am getting the errors below. Any help with this would be great!

    scalac -classpath /apps/software/secondstring/secondstring/dist/lib/secondstring-20140723.jar:/opt/cloudera/parcels/CDH/lib/spark/core/lib/spark-core_2.10-1.0.0-cdh5.1.0.jar:/opt/cloudera/parcels/CDH/lib/spark/lib/kryo-2.21.jar:/opt/cloudera/parcels/CDH/lib/hadoop/lib/commons-io-2.4.jar JaccardScore.scala

    JaccardScore.scala:37: error: bad symbolic reference. A signature in SparkContext.class refers to term io in package org.apache.hadoop which is not available. It may be completely missing from the current classpath, or the version on the classpath might be incompatible with the version used when compiling SparkContext.class.
        val mjc = new Jaccard() with Serializable
        ^
    JaccardScore.scala:39: error: bad symbolic reference. A signature in SparkContext.class refers to term io in package org.apache.hadoop which is not available. It may be completely missing from the current classpath, or the version on the classpath might be incompatible with the version used when compiling SparkContext.class.
        val conf = new SparkConf().setMaster("spark://pzxnvm2021:7077").setAppName("ApproxStrMatch")
        ^
    JaccardScore.scala:51: error: bad symbolic reference. A signature in SparkContext.class refers to term io in package org.apache.hadoop which is not available. It may be completely missing from the current classpath, or the version on the classpath might be incompatible with the version used when compiling SparkContext.class.
        var scorevector = destrdd.map(x => jc_.score(str1, new BasicStringWrapper(x)))
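Sean's SBT suggestion can be sketched as a minimal build definition. This is illustrative only — the project name, versions, and use of lib/ are assumptions, not taken from the thread. The key point is that a declared spark-core dependency puts Spark and, transitively, the Hadoop classes on the compile classpath, which is exactly what the failing scalac invocation lacked.

```scala
// build.sbt — illustrative sketch of the SBT route Sean suggests.
// Versions are assumptions; match them to your cluster (e.g. CDH 5.1).
name := "approxstrmatch"

version := "1.0"

scalaVersion := "2.10.4"

// "provided" keeps spark-core (and its Hadoop dependencies) on the compile
// classpath without bundling them into your jar — the cluster supplies
// them at runtime.
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.0" % "provided"

// A jar with no published artifact (e.g. secondstring) can be dropped into
// lib/ as an unmanaged dependency instead of a hand-maintained classpath.
```

With this in place, `sbt package` resolves the compiler classpath automatically instead of relying on a hand-built `scalac -classpath`.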
RE: error: bad symbolic reference. A signature in SparkContext.class refers to term io in package org.apache.hadoop which is not available
Hi Sean,

Thanks for the quick reply. I moved to an SBT-based build and was able to build the project successfully. In /apps/sameert/software/approxstrmatch I see the following:

    jar -tf target/scala-2.10/approxstrmatch_2.10-1.0.jar
    META-INF/MANIFEST.MF
    approxstrmatch/
    approxstrmatch/MyRegistrator.class
    approxstrmatch/JaccardScore$$anonfun$calculateJaccardScore$1.class
    approxstrmatch/JaccardScore$$anonfun$calculateAnotatedJaccardScore$1.class
    approxstrmatch/JaccardScore$$anonfun$calculateSortedJaccardScore$1$$anonfun$4.class
    approxstrmatch/JaccardScore$$anon$1.class
    approxstrmatch/JaccardScore$$anonfun$calculateSortedJaccardScore$1$$anonfun$3.class
    approxstrmatch/JaccardScore$$anonfun$calculateSortedJaccardScore$1.class
    approxstrmatch/JaccardScore$$anonfun$calculateAnotatedJaccardScore$1$$anonfun$2.class
    approxstrmatch/JaccardScore.class
    approxstrmatch/JaccardScore$$anonfun$calculateSortedJaccardScore$1$$anonfun$5.class
    approxstrmatch/JaccardScore$$anonfun$calculateJaccardScore$1$$anonfun$1.class

However, when I start my Spark shell:

    spark-shell --jars /apps/sameert/software/secondstring/secondstring/dist/lib/secondstring-20140723.jar /apps/sameert/software/approxstrmatch/target/scala-2.10/approxstrmatch_2.10-1.0.jar

and type the following interactively, I get an error. I am not sure what I am missing now; this used to work before.

    val srcFile = sc.textFile("hdfs://ipaddr:8020/user/sameert/approxstrmatch/target-sentences.csv")
    val distFile = sc.textFile("hdfs://ipaddr:8020/user/sameert/approxstrmatch/sameer_sentence_filter.tsv")
    val score = new approxstrmatch.JaccardScore()
    error: not found: value approxstrmatch

From: so...@cloudera.com
Date: Wed, 23 Jul 2014 18:11:34 +0100
Subject: Re: error: bad symbolic reference. A signature in SparkContext.class refers to term io in package org.apache.hadoop which is not available
To: user@spark.apache.org
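For readers unfamiliar with the measure being computed here: secondstring's `Jaccard` class scores the token overlap between two strings. The toy Scala version below is illustration only — it is not the library's implementation, and the whitespace tokenization is an assumption.

```scala
// Toy token-set Jaccard similarity: |A ∩ B| / |A ∪ B| over whitespace tokens.
// Illustrative only — secondstring's Jaccard class is the real implementation
// used in this thread.
object JaccardSketch {
  def jaccard(a: String, b: String): Double = {
    val sa = a.toLowerCase.split("\\s+").filter(_.nonEmpty).toSet
    val sb = b.toLowerCase.split("\\s+").filter(_.nonEmpty).toSet
    if (sa.isEmpty && sb.isEmpty) 1.0
    else (sa intersect sb).size.toDouble / (sa union sb).size
  }
}
```

Identical token sets score 1.0 and disjoint sets score 0.0; the quoted line `destrdd.map(x => jc_.score(str1, new BasicStringWrapper(x)))` applies such a score to every element of an RDD.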
RE: error: bad symbolic reference. A signature in SparkContext.class refers to term io in package org.apache.hadoop which is not available
I was able to resolve this. In my spark-shell command I had forgotten the comma between the two jar files.

From: ssti...@live.com
To: user@spark.apache.org
Subject: RE: error: bad symbolic reference. A signature in SparkContext.class refers to term io in package org.apache.hadoop which is not available
Date: Wed, 23 Jul 2014 11:29:03 -0700
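The fix from this last message, spelled out: `--jars` takes a single comma-separated list, so a space between the two paths makes the second jar a separate argument rather than part of the jar list, and it is never shipped to the shell. The corrected invocation, using the paths from the thread, would look like this:

```shell
# --jars expects ONE comma-separated argument; with a space instead of a
# comma, approxstrmatch_2.10-1.0.jar was never added to the session, hence
# "error: not found: value approxstrmatch".
spark-shell --jars /apps/sameert/software/secondstring/secondstring/dist/lib/secondstring-20140723.jar,/apps/sameert/software/approxstrmatch/target/scala-2.10/approxstrmatch_2.10-1.0.jar
```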