Sean, how does this model actually work? Let's say we want to run one job as N threads, all executing one particular task, e.g. streaming data out of Kafka into a search engine. How do we configure our Spark job execution?
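For concreteness, here's a minimal sketch of the kind of job I mean, using the direct Kafka stream API (Spark 1.3+); the broker address, topic name, and the indexToSearchEngine call are hypothetical stand-ins for our actual setup:

    import kafka.serializer.StringDecoder
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    object Driver {
      def main(args: Array[String]): Unit = {
        // Master (e.g. local[*]) comes from spark-submit, not set here.
        val ssc = new StreamingContext(
          new SparkConf().setAppName("kafka-to-search"), Seconds(5))

        val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
          ssc,
          Map("metadata.broker.list" -> "localhost:9092"),  // stand-in broker
          Set("events"))                                    // stand-in topic

        stream.foreachRDD { rdd =>
          // With the direct stream, rdd has one partition per Kafka
          // partition, which caps the number of concurrent tasks here.
          rdd.foreachPartition { records =>
            records.foreach { case (_, value) => indexToSearchEngine(value) }
          }
        }

        ssc.start()
        ssc.awaitTermination()
      }

      // Stand-in for the real search-engine indexing client.
      def indexToSearchEngine(doc: String): Unit = println(doc)
    }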
Right now, I'm seeing this job run as a single thread, and it's quite a bit slower than a simple utility doing the same task with a thread executor backed by a pool of N threads: the Kafka-Spark Streaming job runs 7 times slower than the utility. What's holding Spark back? Thanks.

On Mon, May 11, 2015 at 4:55 PM, Sean Owen <so...@cloudera.com> wrote:

> You have one worker with one executor with 32 execution slots.
>
> On Mon, May 11, 2015 at 9:52 PM, dgoldenberg <dgoldenberg...@gmail.com> wrote:
> > Hi,
> >
> > Is there anything special one must do, running locally and submitting a
> > job like so:
> >
> > spark-submit \
> >   --class "com.myco.Driver" \
> >   --master local[*] \
> >   ./lib/myco.jar
> >
> > In my logs, I'm only seeing log messages with the thread identifier of
> > "Executor task launch worker-0".
> >
> > There are 4 cores on the machine so I expected 4 threads to be at play.
> > Running with local[32] did not yield 32 worker threads.
> >
> > Any recommendations? Thanks.
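P.S. If it helps pin down what I'm seeing, here's a minimal, self-contained sketch: local[32] provides 32 execution slots in one JVM, but a stage only runs as many concurrent tasks as its RDD has partitions, so a single-partition RDD still runs one task at a time until it's repartitioned (the numbers below are arbitrary):

    import org.apache.spark.{SparkConf, SparkContext}

    object LocalParallelismCheck {
      def main(args: Array[String]): Unit = {
        // 32 slots in a single local JVM.
        val sc = new SparkContext(
          new SparkConf().setAppName("local-n-check").setMaster("local[32]"))

        val one  = sc.parallelize(1 to 1000, 1)  // 1 partition -> at most 1 concurrent task
        val many = one.repartition(32)           // 32 partitions -> up to 32 concurrent tasks

        many.foreachPartition { it =>
          // Thread names here are the "Executor task launch worker-N"
          // identifiers that show up in the logs.
          println(s"${Thread.currentThread().getName}: ${it.length} records")
        }
        sc.stop()
      }
    }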