@Sean, the %% syntax in SBT automatically appends the Scala binary
version suffix (_2.10, _2.11, etc.) to the artifact name for you, so that
does appear to be the correct syntax for the build.
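
For example, with scalaVersion := "2.10.4", these two lines should
resolve to the same artifact (the second just spells the suffix out by
hand):

libraryDependencies += "org.apache.spark" %% "spark-streaming-twitter" % "1.0.0"

libraryDependencies += "org.apache.spark" % "spark-streaming-twitter_2.10" % "1.0.0"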

I ran into what looks like the same issue with some missing Jackson deps,
and solved it by including the jar explicitly on the driver class path:

bin/spark-submit --driver-class-path
SimpleApp/target/scala-2.10/simple-project_2.10-1.0.jar --class
"SimpleApp" SimpleApp/target/scala-2.10/simple-project_2.10-1.0.jar

This seems redundant to me, since I thought the application jar passed as
the last argument is copied to the driver and made available anyway. But
it solved the problem for me, so perhaps give it a try?
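
Failing that, note the docs you quote below talk about an "assembled"
jar. spark-submit only ships the single jar you pass it, not your library
dependencies, so another option is to bundle everything into a fat jar
with the sbt-assembly plugin. Rough sketch only -- I'm assuming the
0.11.x plugin line here, so check the sbt-assembly README for the version
matching your SBT:

project/assembly.sbt
------------------------------------------
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.11.2")
------------------------------------------

and at the top of simple.sbt:
------------------------------------------
import AssemblyKeys._

assemblySettings
------------------------------------------

Then running "sbt assembly" produces an *-assembly-*.jar under
target/scala-2.10/ that bundles spark-streaming-twitter and twitter4j.
You'd probably also want to mark spark-core and spark-streaming as
"provided" so the Spark classes themselves aren't packed in, e.g.:

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.0" % "provided"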



On Wed, Jun 4, 2014 at 3:01 PM, Sean Owen <so...@cloudera.com> wrote:

> Those aren't the names of the artifacts:
>
>
> http://search.maven.org/#search%7Cga%7C1%7Ca%3A%22spark-streaming-twitter_2.10%22
>
> The name is "spark-streaming-twitter_2.10"
>
> On Wed, Jun 4, 2014 at 1:49 PM, Jeremy Lee
> <unorthodox.engine...@gmail.com> wrote:
> > Man, this has been hard going. Six days, and I finally got a "Hello
> > World" App working that I wrote myself.
> >
> > Now I'm trying to make a minimal streaming app based on the twitter
> > examples (running standalone right now while learning), and when
> > running it like this:
> >
> > bin/spark-submit --class "SimpleApp"
> > SimpleApp/target/scala-2.10/simple-project_2.10-1.0.jar
> >
> > I'm getting this error:
> >
> > Exception in thread "main" java.lang.NoClassDefFoundError:
> > org/apache/spark/streaming/twitter/TwitterUtils$
> >
> > Which I'm guessing is because I haven't put in a dependency on
> > "external/twitter" in the .sbt, but _how_? I can't find any docs on it.
> > Here's my build file so far:
> >
> > simple.sbt
> > ------------------------------------------
> > name := "Simple Project"
> >
> > version := "1.0"
> >
> > scalaVersion := "2.10.4"
> >
> > libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.0"
> >
> > libraryDependencies += "org.apache.spark" %% "spark-streaming" % "1.0.0"
> >
> > libraryDependencies += "org.apache.spark" %% "spark-streaming-twitter" %
> > "1.0.0"
> >
> > libraryDependencies += "org.twitter4j" % "twitter4j-stream" % "3.0.3"
> >
> > resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
> > ------------------------------------------
> >
> > I've tried a few obvious things like adding:
> >
> > libraryDependencies += "org.apache.spark" %% "spark-external" % "1.0.0"
> >
> > libraryDependencies += "org.apache.spark" %% "spark-external-twitter" %
> > "1.0.0"
> >
> > because, well, that would match the naming scheme implied so far, but
> > both of those error out.
> >
> >
> > Also, I just realized I don't completely understand if:
> > (a) the "spark-submit" command _sends_ the .jar to all the workers, or
> > (b) the "spark-submit" command sends a _job_ to the workers, which are
> > supposed to already have the jar file installed (or in hdfs), or
> > (c) the Context is supposed to list the jars to be distributed. (is that
> > deprecated?)
> >
> > One part of the documentation says:
> >
> >  "Once you have an assembled jar you can call the bin/spark-submit
> script as
> > shown here while passing your jar."
> >
> > but another says:
> >
> > "application-jar: Path to a bundled jar including your application and
> all
> > dependencies. The URL must be globally visible inside of your cluster,
> for
> > instance, an hdfs:// path or a file:// path that is present on all
> nodes."
> >
> > I suppose both could be correct if you take a certain point of view.
> >
> > --
> > Jeremy Lee  BCompSci(Hons)
> >   The Unorthodox Engineers
>