Sean, how does this model actually work? Let's say we want to run one job as N threads, all executing one particular task, e.g. streaming data out of Kafka into a search engine. How do we configure our Spark job execution?
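For concreteness, here's a minimal sketch of the kind of job I mean, using the direct Kafka stream API (Spark 1.3+); the broker address, topic name, and the indexToSearchEngine call are hypothetical stand-ins for our actual setup:

    import kafka.serializer.StringDecoder
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    object Driver {
      def main(args: Array[String]): Unit = {
        // Master (e.g. local[*]) comes from spark-submit, not set here.
        val ssc = new StreamingContext(
          new SparkConf().setAppName("kafka-to-search"), Seconds(5))

        val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
          ssc,
          Map("metadata.broker.list" -> "localhost:9092"),  // stand-in broker
          Set("events"))                                    // stand-in topic

        stream.foreachRDD { rdd =>
          // With the direct stream, rdd has one partition per Kafka
          // partition, which caps the number of concurrent tasks here.
          rdd.foreachPartition { records =>
            records.foreach { case (_, value) => indexToSearchEngine(value) }
          }
        }

        ssc.start()
        ssc.awaitTermination()
      }

      // Stand-in for the real search-engine indexing client.
      def indexToSearchEngine(doc: String): Unit = println(doc)
    }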
Right now, I'm seeing this job run as a single thread, and it's quite a bit slower than a simple utility doing the same task with a thread executor backed by a pool of N threads: the Kafka-Spark Streaming job runs 7 times slower than the utility. What's holding Spark back? Thanks.

On Mon, May 11, 2015 at 4:55 PM, Sean Owen <so...@cloudera.com> wrote:

> You have one worker with one executor with 32 execution slots.
>
> On Mon, May 11, 2015 at 9:52 PM, dgoldenberg <dgoldenberg...@gmail.com> wrote:
> > Hi,
> >
> > Is there anything special one must do, running locally and submitting a
> > job like so:
> >
> > spark-submit \
> >   --class "com.myco.Driver" \
> >   --master local[*] \
> >   ./lib/myco.jar
> >
> > In my logs, I'm only seeing log messages with the thread identifier of
> > "Executor task launch worker-0".
> >
> > There are 4 cores on the machine so I expected 4 threads to be at play.
> > Running with local[32] did not yield 32 worker threads.
> >
> > Any recommendations? Thanks.
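P.S. If it helps pin down what I'm seeing, here's a minimal, self-contained sketch: local[32] provides 32 execution slots in one JVM, but a stage only runs as many concurrent tasks as its RDD has partitions, so a single-partition RDD still runs one task at a time until it's repartitioned (the numbers below are arbitrary):

    import org.apache.spark.{SparkConf, SparkContext}

    object LocalParallelismCheck {
      def main(args: Array[String]): Unit = {
        // 32 slots in a single local JVM.
        val sc = new SparkContext(
          new SparkConf().setAppName("local-n-check").setMaster("local[32]"))

        val one  = sc.parallelize(1 to 1000, 1)  // 1 partition -> at most 1 concurrent task
        val many = one.repartition(32)           // 32 partitions -> up to 32 concurrent tasks

        many.foreachPartition { it =>
          // Thread names here are the "Executor task launch worker-N"
          // identifiers that show up in the logs.
          println(s"${Thread.currentThread().getName}: ${it.length} records")
        }
        sc.stop()
      }
    }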