Yes, I am sharing the cluster across many jobs, and each job only needs 8
cores (in fact, because the jobs are so small and are counting uniques, they
only get slower as you add more cores). My question is how to limit each
job to only use 8 cores, but have the entire cluster available for that
You are absolutely correct, I apologize.
My understanding was that you are sharing the machine across many jobs. That
was the context in which I was making that comment.
-adrian
Sent from my iPhone
On 03 Oct 2015, at 07:03, Philip Weaver
Philip, the guy is trying to help you. Calling him silly is a bit too far. He
might assume your problem is IO-bound, which might not be the case. If you need
only 4 cores per job no matter what, there is little advantage to using Spark,
in my opinion, because you can easily do this with just a worker
I believe I've described my use case clearly, and now its legitimacy is being
questioned. I will assert again that if you don't understand my use
case, it really doesn't make sense to make any statement about how many
resources I should need.
And I'm sorry, but I completely disagree with your
Since I'm running Spark on Mesos, to be fair I should give Mesos credit,
too! And I should also put some effort into describing what I'm trying to
accomplish more clearly. There are really three levels of scheduling
that I'm hoping to exploit:
- Scheduling in Mesos across all frameworks, where
You can't really say 8 cores is not much horsepower when you have no idea
what my use case is. That's silly.
On Fri, Sep 18, 2015 at 10:33 PM, Adrian Tanase wrote:
> Forgot to mention that you could also restrict the parallelism to 4,
> essentially using only 4 cores at any
Reading through the docs, it seems that with a combination of the FAIR
scheduler and maybe pools you can get pretty far.
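For illustration, a minimal sketch of that approach; the allocation file path
and the pool name "request-pool" below are placeholders, not something from
your setup:

  import org.apache.spark.{SparkConf, SparkContext}

  // Enable the FAIR scheduler; fairscheduler.xml would define the pools
  // (weight / minShare). Path and pool name are only illustrative.
  val conf = new SparkConf()
    .setAppName("fair-scheduling-sketch")
    .set("spark.scheduler.mode", "FAIR")
    .set("spark.scheduler.allocation.file", "/path/to/fairscheduler.xml")
  val sc = new SparkContext(conf)

  // Each thread that submits work picks its pool via a thread-local property.
  sc.setLocalProperty("spark.scheduler.pool", "request-pool")
  sc.textFile("hdfs:///some/input").distinct().count()
  sc.setLocalProperty("spark.scheduler.pool", null)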
However, the smallest unit of scheduled work is the task, so you probably need
to think about the parallelism of each transformation.
I'm guessing that by increasing the level of
Forgot to mention that you could also restrict the parallelism to 4,
essentially using only 4 cores at any given time, however if your job is
complex, a stage might be broken into more than 1 task...
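Something along these lines, as a rough sketch only (it assumes an existing
SparkContext sc and a placeholder input path):

  // Cap a job's parallelism at 4 by limiting partition counts; with only 4
  // partitions a stage occupies at most 4 cores at a time, but stages after a
  // shuffle need the same cap as well.
  val input = sc.textFile("hdfs:///some/input").coalesce(4)
  val uniques = input.distinct(numPartitions = 4).count()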
Sent from my iPhone
On 19 Sep 2015, at 08:30, Adrian Tanase
Here's a specific example of what I want to do. My Spark application is
running with total-executor-cores=8. A request comes in, it spawns a thread
to handle that request, and starts a job. That job should use only 4 cores,
not all 8 of the cores available to the cluster. When the first job is
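In rough Scala, a sketch of that thread-per-request pattern (the pool name,
thread-pool size, and data path are placeholders, and it assumes an existing
SparkContext sc plus the FAIR pool setup discussed earlier):

  import java.util.concurrent.Executors

  // One long-lived SparkContext; each request submits its own Spark job from
  // its own thread. This alone does not cap a job at 4 cores; that still comes
  // from the scheduler pool and parallelism settings.
  val requestExecutor = Executors.newFixedThreadPool(8)

  def handleRequest(path: String): Unit = {
    requestExecutor.submit(new Runnable {
      override def run(): Unit = {
        // Local properties are per-thread, so each request can choose a pool.
        sc.setLocalProperty("spark.scheduler.pool", "small-jobs")
        val uniqueCount = sc.textFile(path).distinct(numPartitions = 4).count()
        println(s"$path has $uniqueCount unique values")
      }
    })
  }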
(whoops, redundant sentence in that first paragraph)
On Fri, Sep 18, 2015 at 8:36 AM, Philip Weaver
wrote:
> Here's a specific example of what I want to do. My Spark application is
> running with total-executor-cores=8. A request comes in, it spawns a thread
> to handle
I'm playing around with dynamic allocation in spark-1.5.0, with the FAIR
scheduler, so I can define a long-running application capable of executing
multiple simultaneous spark jobs.
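For reference, a minimal sketch of that kind of setup (the executor counts are
illustrative, not taken from the actual application, and dynamic allocation
also requires the external shuffle service to be enabled on the workers):

  import org.apache.spark.{SparkConf, SparkContext}

  // Long-running application: FAIR scheduling plus dynamic allocation.
  val conf = new SparkConf()
    .setAppName("long-running-job-server")
    .set("spark.scheduler.mode", "FAIR")
    .set("spark.dynamicAllocation.enabled", "true")
    .set("spark.dynamicAllocation.minExecutors", "1")
    .set("spark.dynamicAllocation.maxExecutors", "16")
    .set("spark.shuffle.service.enabled", "true")
  val sc = new SparkContext(conf)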
The kinds of jobs that I'm running do not benefit from more than 4 cores,
but I want my application to be able to