FAIR scheduler in Spark Streaming

2016-01-26 Thread Sebastian Piu
Hi,

I'm trying to get FAIR scheduling to work in a Spark Streaming app
(1.6.0).

I've found a previous mailing list thread that suggests doing the following:

dstream.foreachRDD { rdd =>
  rdd.sparkContext.setLocalProperty("spark.scheduler.pool", "pool1") // set the pool
  rdd.count() // or whatever job
}

This seems to work, in the sense that if I have 5 foreachRDD calls in my code,
each one is sent to a different pool, but they still get executed one
after the other rather than at the same time.
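For clarity, the five outputs are wired up roughly like this (the stream
collection and pool names here are illustrative, not my exact code):

val pools = Seq("pool1", "pool2", "pool3", "pool4", "pool5")
streams.zip(pools).foreach { case (stream, pool) => // streams: Seq[DStream[_]]
  stream.foreachRDD { rdd =>
    rdd.sparkContext.setLocalProperty("spark.scheduler.pool", pool) // per-output pool
    rdd.count() // or whatever job
  }
}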
Am I missing something?

The scheduler config and scheduler mode are being picked up alright, as I can
see them in the Spark UI.

// Context config

spark.scheduler.mode=FAIR
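
In case it helps, this is roughly how I build the context (app name, file
path and batch interval here are illustrative):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf()
  .setAppName("FairStreamingApp") // illustrative name
  .set("spark.scheduler.mode", "FAIR")
  .set("spark.scheduler.allocation.file", "/path/to/fairscheduler.xml") // illustrative path
val ssc = new StreamingContext(conf, Seconds(10)) // illustrative batch interval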

Here is my scheduler config:


<allocations>
  <pool name="…">
    <schedulingMode>FAIR</schedulingMode>
    <weight>2</weight>
    <minShare>1</minShare>
  </pool>
  <pool name="…">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
    <minShare>0</minShare>
  </pool>
  <pool name="…">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
    <minShare>0</minShare>
  </pool>
  <pool name="…">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
    <minShare>0</minShare>
  </pool>
  <pool name="…">
    <schedulingMode>FAIR</schedulingMode>
    <weight>2</weight>
    <minShare>1</minShare>
  </pool>
</allocations>


Any idea on what could be wrong?


Re: FAIR scheduler in Spark Streaming

2016-01-26 Thread Shixiong(Ryan) Zhu
The number of concurrent Streaming jobs is controlled by
"spark.streaming.concurrentJobs". It's 1 by default. However, you need to
keep in mind that setting it to a bigger number will allow jobs from several
batches to run at the same time. It's hard to predict the behavior, and it
will sometimes surprise you.
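
If you want to try it, something like this when building your conf should do
(the value is just an example; the property is undocumented, so treat it as
experimental):

val conf = new SparkConf()
  .set("spark.scheduler.mode", "FAIR")
  .set("spark.streaming.concurrentJobs", "5") // e.g. one slot per output operation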



Re: FAIR scheduler in Spark Streaming

2016-01-26 Thread Sebastian Piu
Thanks Shixiong, I'll give it a try and report back

Cheers