It sounds like the first job is occupying all of the available resources. You should
limit the resources that a single job can acquire.
The FAIR scheduler is one way to do that.
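For scheduling within a single application, a rough sketch (app and pool names here
are just placeholders) would look something like this in Scala:

    import org.apache.spark.{SparkConf, SparkContext}

    // enable FAIR scheduling so concurrent jobs share cores instead of
    // running strictly FIFO
    val conf = new SparkConf()
      .setAppName("http-request-processor")   // placeholder name
      .set("spark.scheduler.mode", "FAIR")
    val sc = new SparkContext(conf)

    // in each HTTP request-handling thread, tag the jobs with a pool
    // (pool name is arbitrary) so requests get scheduled fairly
    sc.setLocalProperty("spark.scheduler.pool", "requests")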
A possibly simpler way is to configure spark.deploy.defaultCores or
spark.cores.max.
The defaults for these values under Spark's built-in cluster resource
manager (aka Spark Standalone) are effectively infinite, so every job will
try to acquire every available core.
https://spark.apache.org/docs/latest/spark-standalone.html
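For example, a minimal sketch that caps what one application can grab (the 4-core
cap is just an illustrative number):

    import org.apache.spark.{SparkConf, SparkContext}

    // cap the total cores this application can acquire across the cluster;
    // without this (or spark.deploy.defaultCores on the master), the first
    // job takes everything
    val conf = new SparkConf()
      .setAppName("request-job")       // placeholder name
      .set("spark.cores.max", "4")     // example value
    val sc = new SparkContext(conf)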
Here's an example config that I use for my reference data pipeline project:
https://github.com/fluxcapacitor/pipeline/blob/master/config/spark/spark-defaults.conf
I'm always tweaking these values to simulate different conditions, but the
current snapshot might be helpful.
Also, don't forget about executor memory...
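Something like this (again, illustrative values only):

    import org.apache.spark.SparkConf

    // pair the core cap with a per-executor memory limit so a single job
    // can't hog memory either
    val conf = new SparkConf()
      .set("spark.cores.max", "4")           // example value
      .set("spark.executor.memory", "2g")    // example value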
On Fri, Feb 12, 2016 at 1:40 PM, Silvio Fiorito <
silvio.fior...@granturing.com> wrote:
> You’ll want to set up the FAIR scheduler as described here:
> https://spark.apache.org/docs/latest/job-scheduling.html#scheduling-within-an-application
>
> From: yael aharon
> Date: Friday, February 12, 2016 at 2:00 PM
> To: "user@spark.apache.org"
> Subject: Allowing parallelism in spark local mode
>
> Hello,
> I have an application that receives requests over HTTP and uses Spark in
> local mode to process the requests. Each request runs in its own thread.
> It seems that Spark is queueing the jobs, processing them one at a time.
> When 2 requests arrive simultaneously, the processing time for each of them
> is almost doubled.
> I tried setting spark.default.parallelism, spark.executor.cores,
> spark.driver.cores but that did not change the time in a meaningful way.
>
> Am I missing something obvious?
> thanks, Yael
>
>
--
*Chris Fregly*
Principal Data Solutions Engineer
IBM Spark Technology Center, San Francisco, CA
http://spark.tc | http://advancedspark.com