@Sean, the %% syntax in SBT should automatically add the Scala binary version qualifier (_2.10, _2.11 etc.) for you, so that does appear to be the correct syntax for the build.
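To illustrate (assuming Scala 2.10.4, as in Jeremy's build), `%%` is just shorthand for appending the Scala binary version suffix to the artifact name, so these two declarations resolve the same artifact:

```scala
// With scalaVersion := "2.10.4", these two lines are equivalent:
libraryDependencies += "org.apache.spark" %% "spark-streaming-twitter" % "1.0.0"
libraryDependencies += "org.apache.spark" % "spark-streaming-twitter_2.10" % "1.0.0"
```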
I seemed to run into this issue with some missing Jackson deps, and solved it by including the jar explicitly on the driver class path:

bin/spark-submit --driver-class-path SimpleApp/target/scala-2.10/simple-project_2.10-1.0.jar --class "SimpleApp" SimpleApp/target/scala-2.10/simple-project_2.10-1.0.jar

Seems redundant to me, since I thought the JAR passed as an argument is copied to the driver and made available. But this solved it for me, so perhaps give it a try?

On Wed, Jun 4, 2014 at 3:01 PM, Sean Owen <so...@cloudera.com> wrote:
> Those aren't the names of the artifacts:
>
> http://search.maven.org/#search%7Cga%7C1%7Ca%3A%22spark-streaming-twitter_2.10%22
>
> The name is "spark-streaming-twitter_2.10"
>
> On Wed, Jun 4, 2014 at 1:49 PM, Jeremy Lee
> <unorthodox.engine...@gmail.com> wrote:
> > Man, this has been hard going. Six days, and I finally got a "Hello World"
> > app working that I wrote myself.
> >
> > Now I'm trying to make a minimal streaming app based on the Twitter
> > examples (running standalone right now while learning), and when running it
> > like this:
> >
> > bin/spark-submit --class "SimpleApp"
> > SimpleApp/target/scala-2.10/simple-project_2.10-1.0.jar
> >
> > I'm getting this error:
> >
> > Exception in thread "main" java.lang.NoClassDefFoundError:
> > org/apache/spark/streaming/twitter/TwitterUtils$
> >
> > Which I'm guessing is because I haven't put in a dependency on
> > "external/twitter" in the .sbt, but _how_? I can't find any docs on it.
> > Here's my build file so far:
> >
> > simple.sbt
> > ------------------------------------------
> > name := "Simple Project"
> >
> > version := "1.0"
> >
> > scalaVersion := "2.10.4"
> >
> > libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.0"
> >
> > libraryDependencies += "org.apache.spark" %% "spark-streaming" % "1.0.0"
> >
> > libraryDependencies += "org.apache.spark" %% "spark-streaming-twitter" % "1.0.0"
> >
> > libraryDependencies += "org.twitter4j" % "twitter4j-stream" % "3.0.3"
> >
> > resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
> > ------------------------------------------
> >
> > I've tried a few obvious things like adding:
> >
> > libraryDependencies += "org.apache.spark" %% "spark-external" % "1.0.0"
> >
> > libraryDependencies += "org.apache.spark" %% "spark-external-twitter" % "1.0.0"
> >
> > because, well, that would match the naming scheme implied so far, but it errors.
> >
> > Also, I just realized I don't completely understand whether:
> > (a) the "spark-submit" command _sends_ the .jar to all the workers, or
> > (b) the "spark-submit" command sends a _job_ to the workers, which are
> > supposed to already have the jar file installed (or in HDFS), or
> > (c) the Context is supposed to list the jars to be distributed. (Is that
> > deprecated?)
> >
> > One part of the documentation says:
> >
> > "Once you have an assembled jar you can call the bin/spark-submit script as
> > shown here while passing your jar."
> >
> > but another says:
> >
> > "application-jar: Path to a bundled jar including your application and all
> > dependencies. The URL must be globally visible inside of your cluster, for
> > instance, an hdfs:// path or a file:// path that is present on all nodes."
> >
> > I suppose both could be correct if you take a certain point of view.
> >
> > --
> > Jeremy Lee BCompSci(Hons)
> > The Unorthodox Engineers
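For what it's worth, a minimal simple.sbt along the lines Sean describes might look like the sketch below. It uses the artifact name from his Maven Central link (via the `%%` form, which adds the _2.10 suffix), so no "spark-external-*" guesses are needed; versions are the ones already in the thread:

```scala
name := "Simple Project"

version := "1.0"

scalaVersion := "2.10.4"

// %% appends the Scala binary version, yielding e.g. spark-streaming-twitter_2.10
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.0.0",
  "org.apache.spark" %% "spark-streaming" % "1.0.0",
  "org.apache.spark" %% "spark-streaming-twitter" % "1.0.0",
  "org.twitter4j" % "twitter4j-stream" % "3.0.3"
)

resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
```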