Re: Why can't Spark find the classes in this Jar?

2015-02-12 Thread Deborah Siegel
Hi Abe,
I'm new to Spark as well, so someone else could answer better. A few
thoughts which may or may not be the right line of thinking:

1) Spark properties can be set on the SparkConf or with flags to
spark-submit, but settings on SparkConf take precedence. I think your
--jars flag for spark-submit may be redundant.

2) Is there a chance that stanford-corenlp-3.5.0.jar relies on other
dependencies? I could be wrong, but if there is no other reason not to,
try building your application as an uber-jar with a build tool like
Maven, which will package all of the transitive dependencies into a
single jar. You can find stanford-corenlp on Maven Central; I think you
would add the dependencies below to your pom.xml (a possible
shade-plugin configuration follows them). After building
simple-project-1.0.jar with these dependencies, you would not set jars
on the SparkConf or pass a --jars flag to spark-submit.

<dependencies>
  <dependency>
    <groupId>edu.stanford.nlp</groupId>
    <artifactId>stanford-corenlp</artifactId>
    <version>3.5.0</version>
  </dependency>
  <dependency>
    <groupId>edu.stanford.nlp</groupId>
    <artifactId>stanford-corenlp</artifactId>
    <version>3.5.0</version>
    <classifier>models</classifier>
  </dependency>
</dependencies>
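
To actually produce the uber-jar, one option (just a sketch, not something
I've tested against your project; the plugin version is only an example) is
the maven-shade-plugin, added to the build section of the same pom.xml:

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>2.3</version>
      <executions>
        <!-- bind the shade goal to the package phase so "mvn package"
             produces a jar that also contains the transitive dependencies -->
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>

You would probably also want to mark the spark-core dependency itself with
scope "provided" so Spark is not bundled into the jar. After "mvn package",
target/simple-project-1.0.jar should then contain the CoreNLP classes and
can be submitted on its own, without --jars.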

HTH.
Deb

On Tue, Feb 10, 2015 at 1:12 PM, Abe Handler akh2...@gmail.com wrote:

 I am new to Spark. I am trying to compile and run a Spark application that
 requires classes from an (external) jar file on my local machine. If I open
 the jar (on ~/Desktop) I can see the missing class in the local jar, but
 when I run Spark I get

 NoClassDefFoundError: edu/stanford/nlp/ie/AbstractSequenceClassifier

 I add the jar to the Spark context like this:

 String[] jars = {"/home/pathto/Desktop/stanford-corenlp-3.5.0.jar"};
 SparkConf conf = new SparkConf().setAppName("Simple Application").setJars(jars);
 Then I try to run a submit script like this

 /home/me/Downloads/spark-1.2.0-bin-hadoop2.4/bin/spark-submit \
   --class SimpleApp \
   --master local[4] \
   target/simple-project-1.0.jar \
   --jars local[4] /home/abe/Desktop/stanford-corenlp-3.5.0.jar
 and hit the NoClassDefFoundError.

 I get that this means that the worker threads can't find the class from the
 jar. But I am not sure what I am doing wrong. I have tried different
 syntaxes for the last line (below) but none works.

   --addJars local[4] /home/abe/Desktop/stanford-corenlp-3.5.0.jar
   --addJars local:/home/abe/Desktop/stanford-corenlp-3.5.0.jar

 How can I fix this error?



 --
 View this message in context:
 http://apache-spark-user-list.1001560.n3.nabble.com/Why-can-t-Spark-find-the-classes-in-this-Jar-tp21584.html
 Sent from the Apache Spark User List mailing list archive at Nabble.com.

 -
 To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 For additional commands, e-mail: user-h...@spark.apache.org




Re: Why can't Spark find the classes in this Jar?

2015-02-12 Thread Sandy Ryza
What version of Java are you using? CoreNLP dropped support for Java 7 in
its 3.5.0 release.

Also, the correct command line option is --jars, not --addJars.
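
If you stay with --jars rather than building an uber-jar, also note that
spark-submit options have to come before the application jar; anything after
the application jar is passed as arguments to your main class. Something
like this (paths taken from your original message) should be closer:

# --jars takes a comma-separated list of jar paths and must precede the app jar
/home/me/Downloads/spark-1.2.0-bin-hadoop2.4/bin/spark-submit \
  --class SimpleApp \
  --master local[4] \
  --jars /home/abe/Desktop/stanford-corenlp-3.5.0.jar \
  target/simple-project-1.0.jar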
