FAIR scheduler in Spark Streaming

2016-01-26 Thread Sebastian Piu
Hi,

I'm trying to get *FAIR* scheduling to work in a Spark Streaming app
(1.6.0).

I've found a previous mailing list thread where it is suggested to do:

dstream.foreachRDD { rdd =>
  rdd.sparkContext.setLocalProperty("spark.scheduler.pool", "pool1") // set the pool
  rdd.count() // or whatever job
}

This seems to work, in the sense that if I have 5 foreachRDD calls in my
code, each one is sent to a different pool, but they still get executed
one after the other rather than at the same time.
Am I missing something?

The scheduler config and scheduler mode are being picked up alright, as I
can see them in the Spark UI.

//Context config

*spark.scheduler.mode=FAIR*

Here is my scheduler config:


<?xml version="1.0"?>
<allocations>
  <pool name="...">
    <schedulingMode>FAIR</schedulingMode>
    <weight>2</weight>
    <minShare>1</minShare>
  </pool>
  <pool name="...">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
    <minShare>0</minShare>
  </pool>
  <pool name="...">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
    <minShare>0</minShare>
  </pool>
  <pool name="...">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
    <minShare>0</minShare>
  </pool>
  <pool name="...">
    <schedulingMode>FAIR</schedulingMode>
    <weight>2</weight>
    <minShare>1</minShare>
  </pool>
</allocations>


Any idea on what could be wrong?
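[Editor's note: a minimal sketch of the setup described in this message. The pool name "pool1", the allocation-file path, the socket source, and the counting job are all placeholder assumptions, not the original poster's actual code.]

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object FairStreamingSketch {
  def main(args: Array[String]): Unit = {
    // Assumes a fair-scheduler XML (such as the config shown above)
    // exists at the given path and defines a pool named "pool1".
    val conf = new SparkConf()
      .setAppName("fair-streaming-sketch")
      .setMaster("local[4]")
      .set("spark.scheduler.mode", "FAIR")
      .set("spark.scheduler.allocation.file", "/path/to/fairscheduler.xml")

    val ssc = new StreamingContext(conf, Seconds(5))
    val dstream = ssc.socketTextStream("localhost", 9999)

    dstream.foreachRDD { rdd =>
      // Local properties are per-thread, so the pool must be set inside
      // foreachRDD, on the thread that actually submits the job.
      rdd.sparkContext.setLocalProperty("spark.scheduler.pool", "pool1")
      rdd.count() // or whatever job
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```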


Re: FAIR scheduler in Spark Streaming

2016-01-26 Thread Shixiong(Ryan) Zhu
The number of concurrent Streaming jobs is controlled by
"spark.streaming.concurrentJobs". It's 1 by default. However, you need to
keep in mind that setting it to a bigger number will allow jobs of several
batches to run at the same time. It's hard to predict the behavior and it
will sometimes surprise you.
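[Editor's note: a sketch of how this property might be set, per the advice above. "spark.streaming.concurrentJobs" is an undocumented internal setting, and the value 5 is only an illustration.]

```scala
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("concurrent-jobs-sketch")
  .set("spark.scheduler.mode", "FAIR")
  // Undocumented: lets up to 5 streaming jobs run concurrently, which
  // can interleave jobs from different batches of the same stream.
  .set("spark.streaming.concurrentJobs", "5")
```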



Re: FAIR scheduler in Spark Streaming

2016-01-26 Thread Sebastian Piu
Thanks Shixiong, I'll give it a try and report back

Cheers


Re: Spark Fair Scheduler for Spark Streaming - 1.2 and beyond

2015-05-16 Thread Tathagata Das
For the Spark Streaming app, if you want a particular action inside a
foreachRDD to go to a particular pool, then make sure you set the pool
within the foreachRDD function. E.g.

dstream.foreachRDD { rdd =>
  rdd.sparkContext.setLocalProperty("spark.scheduler.pool", "pool1") // set the pool
  rdd.count() // or whatever job
}

This will ensure that the jobs will be allocated to the desired pool. LMK
if this works.

TD


RE: Spark Fair Scheduler for Spark Streaming - 1.2 and beyond

2015-05-15 Thread Evo Eftimov
No pools for the moment – each of the apps just uses the straightforward
spark conf param for scheduling = FAIR

 

Spark is running in a Standalone Mode 

 

Are you saying that configuring pools is mandatory to get the FAIR
scheduling working – from the docs it seemed optional to me

 

From: Tathagata Das [mailto:t...@databricks.com] 
Sent: Friday, May 15, 2015 6:45 PM
To: Evo Eftimov
Cc: user
Subject: Re: Spark Fair Scheduler for Spark Streaming - 1.2 and beyond

 

How are you configuring the fair scheduler pools?

 

On Fri, May 15, 2015 at 8:33 AM, Evo Eftimov evo.efti...@isecc.com wrote:

I have run / submitted a few Spark Streaming apps configured with Fair
scheduling on Spark Streaming 1.2.0, however they still run in a FIFO mode.
Is FAIR scheduling supported at all for Spark Streaming apps and from what
release / version - e.g. 1.3.1





 



RE: Spark Fair Scheduler for Spark Streaming - 1.2 and beyond

2015-05-15 Thread Evo Eftimov
Ok thanks a lot for clarifying that – btw was your application a Spark 
Streaming App – I am also looking for confirmation that FAIR scheduling is 
supported for Spark Streaming Apps 

 


Re: Spark Fair Scheduler for Spark Streaming - 1.2 and beyond

2015-05-15 Thread Richard Marscher
The doc is a bit confusing IMO, but at least for my application I had to
use a fair pool configuration to get my stages to be scheduled with FAIR.


Re: Spark Fair Scheduler for Spark Streaming - 1.2 and beyond

2015-05-15 Thread Richard Marscher
It's not a Spark Streaming app, so sorry, I'm not sure of the answer to
that. I would assume it should work.


Re: Spark Fair Scheduler for Spark Streaming - 1.2 and beyond

2015-05-15 Thread Mark Hamstra
If you don't send jobs to different pools, then they will all end up in the
default pool. If you leave the intra-pool scheduling policy as the default
FIFO, then this will effectively be the same thing as using the default
FIFO scheduling.

Depending on what you are trying to accomplish, you need some combination
of multiple pools and FAIR scheduling within one or more pools. And, of
course, you need to actually place a job within an appropriate pool.
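[Editor's note: the three requirements above can be sketched as follows. The allocation-file path, pool name, and the job itself are placeholder assumptions.]

```scala
import org.apache.spark.{SparkConf, SparkContext}

// 1. Enable FAIR mode and point Spark at a pool definition file.
val conf = new SparkConf()
  .setAppName("pool-placement-sketch")
  .setMaster("local[4]")
  .set("spark.scheduler.mode", "FAIR")
  .set("spark.scheduler.allocation.file", "/path/to/fairscheduler.xml")
val sc = new SparkContext(conf)

// 2. In the XML, give each pool <schedulingMode>FAIR</schedulingMode>
//    so jobs *within* a pool also share resources fairly.

// 3. Actually place the job in a pool. setLocalProperty is per-thread,
//    so set it on the thread that submits the job, and clear it after
//    so later jobs on this thread fall back to the default pool.
sc.setLocalProperty("spark.scheduler.pool", "pool1")
try {
  sc.parallelize(1 to 100).count()
} finally {
  sc.setLocalProperty("spark.scheduler.pool", null)
}
```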
