Re: Limiting number of cores per job in multi-threaded driver.

2015-10-04 Thread Philip Weaver
Yes, I am sharing the cluster across many jobs, and each job only needs 8 cores (in fact, because the jobs are so small and are counting uniques, it only gets slower as you add more cores). My question is how to limit each job to only use 8 cores, but have the entire cluster available for that

Re: Limiting number of cores per job in multi-threaded driver.

2015-10-04 Thread Adrian Tanase
You are absolutely correct, I apologize. My understanding was that you are sharing the machine across many jobs. That was the context in which I was making that comment. -adrian On 03 Oct 2015, at 07:03, Philip Weaver

Re: Limiting number of cores per job in multi-threaded driver.

2015-10-04 Thread Jerry Lam
Philip, the guy is trying to help you. Calling him silly is going a bit too far. He might assume your problem is I/O-bound, which might not be the case. If you need only 4 cores per job no matter what, there is little advantage to using Spark, in my opinion, because you can easily do this with just a worker

Re: Limiting number of cores per job in multi-threaded driver.

2015-10-04 Thread Philip Weaver
I believe I've described my use case clearly, and its legitimacy is being questioned. I will assert again that if you don't understand my use case, it really doesn't make sense to make any statement about how many resources I should need. And I'm sorry, but I completely disagree with your

Re: Limiting number of cores per job in multi-threaded driver.

2015-10-04 Thread Philip Weaver
Since I'm running Spark on Mesos, to be fair I should give Mesos credit, too! And I should also put some effort into describing what I'm trying to accomplish more clearly. There are really three levels of scheduling that I'm hoping to exploit: - Scheduling in Mesos across all frameworks, where

Re: Limiting number of cores per job in multi-threaded driver.

2015-10-02 Thread Philip Weaver
You can't really say 8 cores is not much horsepower when you have no idea what my use case is. That's silly. On Fri, Sep 18, 2015 at 10:33 PM, Adrian Tanase wrote: > Forgot to mention that you could also restrict the parallelism to 4, > essentially using only 4 cores at any

Re: Limiting number of cores per job in multi-threaded driver.

2015-09-18 Thread Adrian Tanase
Reading through the docs it seems that with a combination of FAIR scheduler and maybe pools you can get pretty far. However the smallest unit of scheduled work is the task so probably you need to think about the parallelism of each transformation. I'm guessing that by increasing the level of
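
As a rough illustration of the "FAIR scheduler and maybe pools" idea above, here is a minimal Python sketch that generates a fair-scheduler allocation file. The pool properties used (`schedulingMode`, `weight`, `minShare`) are the ones Spark documents for pools; the pool name `requests`, the helper `build_fair_pools`, and the values are hypothetical. Note that `minShare` is a floor rather than a cap, so a pool on its own does not appear to limit a job to a fixed number of cores.

```python
import xml.etree.ElementTree as ET


def build_fair_pools(pools):
    """Return the text of a Spark fair-scheduler allocation file.

    `pools` maps a pool name to a (weight, min_share) tuple.
    This helper is hypothetical; only the XML element names come from Spark.
    """
    root = ET.Element("allocations")
    for name, (weight, min_share) in pools.items():
        pool = ET.SubElement(root, "pool", name=name)
        ET.SubElement(pool, "schedulingMode").text = "FAIR"
        ET.SubElement(pool, "weight").text = str(weight)
        # minShare is a minimum number of cores, not a maximum.
        ET.SubElement(pool, "minShare").text = str(min_share)
    return ET.tostring(root, encoding="unicode")


# Hypothetical pool for request-handling threads, with a 4-core floor.
xml_text = build_fair_pools({"requests": (1, 4)})
print(xml_text)
```

The resulting file would be referenced through `spark.scheduler.allocation.file` with `spark.scheduler.mode` set to `FAIR`, and a driver thread would pick the pool with `sc.setLocalProperty("spark.scheduler.pool", "requests")` before submitting its job.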

Re: Limiting number of cores per job in multi-threaded driver.

2015-09-18 Thread Adrian Tanase
Forgot to mention that you could also restrict the parallelism to 4, essentially using only 4 cores at any given time; however, if your job is complex, a stage might be broken into more than 1 task... On 19 Sep 2015, at 08:30, Adrian Tanase
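
The suggestion above relies on the fact that a stage runs one task per partition, so a job whose data is split into 4 partitions can occupy at most 4 task slots at once. The following is a plain-Python simulation of that idea, not Spark code: a thread pool of 8 workers stands in for the 8 cores, two driver threads each submit a unique-counting job split into 4 partitions, and the names `Gauge`, `run_job`, and `handle_request` are all hypothetical.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

TOTAL_CORES = 8      # stands in for total-executor-cores=8
PARTITIONS = 4       # per-job parallelism, as suggested in the thread


class Gauge:
    """Tracks how many tasks are running at once and the peak seen."""

    def __init__(self):
        self._lock = threading.Lock()
        self.current = 0
        self.peak = 0

    def __enter__(self):
        with self._lock:
            self.current += 1
            self.peak = max(self.peak, self.current)

    def __exit__(self, *exc):
        with self._lock:
            self.current -= 1


def run_job(cores, items, num_partitions, job_gauge, cluster_gauge):
    """Count unique items using one task per partition."""
    partitions = [items[i::num_partitions] for i in range(num_partitions)]

    def task(part):
        with job_gauge, cluster_gauge:
            time.sleep(0.01)  # pretend to do some work
            return set(part)

    partials = list(cores.map(task, partitions))
    return len(set().union(*partials))


cluster_gauge = Gauge()
job_gauges = [Gauge(), Gauge()]
results = [None, None]
items = list(range(100)) * 2  # 100 distinct values, each repeated

with ThreadPoolExecutor(max_workers=TOTAL_CORES) as cores:

    def handle_request(i):
        # Each request gets its own driver thread and starts one job.
        results[i] = run_job(cores, items, PARTITIONS, job_gauges[i], cluster_gauge)

    threads = [threading.Thread(target=handle_request, args=(i,)) for i in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

print(results, [g.peak for g in job_gauges], cluster_gauge.peak)
```

In this sketch each job can never exceed 4 concurrent tasks because it only has 4 partitions, while the two jobs together can still fill all 8 workers. In a real Spark job, later stages (for example after a shuffle) may use a different partition count, so the limit would have to hold for every stage.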

Re: Limiting number of cores per job in multi-threaded driver.

2015-09-18 Thread Philip Weaver
Here's a specific example of what I want to do. My Spark application is running with total-executor-cores=8. A request comes in, it spawns a thread to handle that request, and starts a job. That job should use only 4 cores, not all 8 of the cores available to the cluster. When the first job is

Re: Limiting number of cores per job in multi-threaded driver.

2015-09-18 Thread Philip Weaver
(whoops, redundant sentence in that first paragraph) On Fri, Sep 18, 2015 at 8:36 AM, Philip Weaver wrote: > Here's a specific example of what I want to do. My Spark application is > running with total-executor-cores=8. A request comes in, it spawns a thread > to handle

Limiting number of cores per job in multi-threaded driver.

2015-09-12 Thread Philip Weaver
I'm playing around with dynamic allocation in Spark 1.5.0, with the FAIR scheduler, so I can define a long-running application capable of executing multiple simultaneous Spark jobs. The kind of jobs that I'm running do not benefit from more than 4 cores, but I want my application to be able to