Re: Graceful shutdown of spark streaming on yarn

2016-05-13 Thread Rakesh H (Marketing Platform-BLR)
Have you used awaitTermination() on your ssc ? --> Yes, I have used that.
Also try setting the deployment mode to yarn-client. --> Is this not
supported in yarn-cluster mode? I am trying to find the root cause for
yarn-cluster mode.
Have you tested graceful shutdown in yarn-cluster mode?
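
For reference, here is a minimal sketch of the kind of driver I am running,
assuming the Spark 1.5-era streaming API (the app name and batch interval are
illustrative, not taken from my actual job):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object GracefulShutdownTest {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("graceful-shutdown-test")
      // Ask Spark's built-in shutdown hook to stop the StreamingContext gracefully
      .set("spark.streaming.stopGracefullyOnShutdown", "true")
    val ssc = new StreamingContext(conf, Seconds(1))
    // ... DStream setup goes here ...
    ssc.start()
    // Block until the context is stopped, e.g. by the shutdown hook on SIGTERM
    ssc.awaitTermination()
  }
}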

Re: Graceful shutdown of spark streaming on yarn

2016-05-13 Thread Deepak Sharma
Rakesh
Have you used awaitTermination() on your ssc?
If not, add this and see if it changes the behavior.
I am guessing this issue may be related to the YARN deployment mode.
Also try setting the deployment mode to yarn-client.

Thanks
Deepak


Re: Graceful shutdown of spark streaming on yarn

2016-05-12 Thread Rakesh H (Marketing Platform-BLR)
Ping!!
Has anybody tested graceful shutdown of a Spark Streaming application in
yarn-cluster mode? It looks like a defect to me.


Re: Graceful shutdown of spark streaming on yarn

2016-05-12 Thread Rakesh H (Marketing Platform-BLR)
We are on Spark 1.5.1.
The change above was to add a shutdown hook.
I am not adding a shutdown hook in my code, so the built-in shutdown hook is
being called.
The driver signals that it is going to do a graceful shutdown, but the executor
sees that the driver is dead and shuts down abruptly.
Could this issue be related to YARN? I see correct behavior locally. I did
"yarn kill " to kill the job.




Re: Graceful shutdown of spark streaming on yarn

2016-05-12 Thread Deepak Sharma
This is happening because the SparkContext shuts down without shutting down
the ssc first.
This was the behavior up to Spark 1.4 and was addressed in later releases:
https://github.com/apache/spark/pull/6307
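
Before that fix, a common manual workaround was to register your own shutdown
hook that stops the StreamingContext gracefully before the SparkContext goes
away. A rough, untested sketch, assuming an ssc already in scope:

// Drain in-flight batches before tearing anything down;
// stopSparkContext = true then stops the underlying SparkContext as well.
sys.addShutdownHook {
  ssc.stop(stopSparkContext = true, stopGracefully = true)
}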

Which version of Spark are you on?

Thanks
Deepak



Re: Graceful shutdown of spark streaming on yarn

2016-05-12 Thread Rakesh H (Marketing Platform-BLR)
Yes, it seems to be the case.
In this case the executors should have continued logging values till 300, but
they shut down as soon as I do "yarn kill .."



Re: Graceful shutdown of spark streaming on yarn

2016-05-12 Thread Deepak Sharma
So in your case, the driver is shutting down gracefully, but the
executors are not.
Is this the problem?

Thanks
Deepak



Re: Graceful shutdown of spark streaming on yarn

2016-05-12 Thread Rakesh H (Marketing Platform-BLR)
Yes, it is set to true.
Log of driver:

16/05/12 10:18:29 ERROR yarn.ApplicationMaster: RECEIVED SIGNAL 15: SIGTERM
16/05/12 10:18:29 INFO streaming.StreamingContext: Invoking stop(stopGracefully=true) from shutdown hook
16/05/12 10:18:29 INFO scheduler.JobGenerator: Stopping JobGenerator gracefully
16/05/12 10:18:29 INFO scheduler.JobGenerator: Waiting for all received blocks to be consumed for job generation
16/05/12 10:18:29 INFO scheduler.JobGenerator: Waited for all received blocks to be consumed for job generation

Log of executor:

16/05/12 10:18:29 ERROR executor.CoarseGrainedExecutorBackend: Driver xx.xx.xx.xx:x disassociated! Shutting down.
16/05/12 10:18:29 WARN remote.ReliableDeliverySupervisor: Association with remote system [xx.xx.xx.xx:x] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
16/05/12 10:18:29 INFO storage.DiskBlockManager: Shutdown hook called
16/05/12 10:18:29 INFO processors.StreamJobRunner$: VALUE -> 204  // This is the value I am logging
16/05/12 10:18:29 INFO util.ShutdownHookManager: Shutdown hook called
16/05/12 10:18:29 INFO processors.StreamJobRunner$: VALUE -> 205
16/05/12 10:18:29 INFO processors.StreamJobRunner$: VALUE -> 206


Re: Graceful shutdown of spark streaming on yarn

2016-05-12 Thread Deepak Sharma
Hi Rakesh
Did you try setting spark.streaming.stopGracefullyOnShutdown to true for
your Spark configuration instance?
If not, try this and let us know if it helps.
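
Something along these lines, as a sketch only, assuming the usual
org.apache.spark.streaming imports and an illustrative batch interval (the
flag can also be passed to spark-submit via --conf):

val conf = new SparkConf()
  // Stop the StreamingContext gracefully when the JVM shutdown hook fires
  .set("spark.streaming.stopGracefullyOnShutdown", "true")
val ssc = new StreamingContext(conf, Seconds(1))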

Thanks
Deepak



Graceful shutdown of spark streaming on yarn

2016-05-12 Thread Rakesh H (Marketing Platform-BLR)
The issue I am having is similar to the one mentioned here:
http://stackoverflow.com/questions/36911442/how-to-stop-gracefully-a-spark-streaming-application-on-yarn

I am creating an RDD from a sequence of 1 to 300 and creating a streaming RDD
out of it.

import org.apache.spark.streaming.dstream.ConstantInputDStream

val rdd = ssc.sparkContext.parallelize(1 to 300)
val dstream = new ConstantInputDStream(ssc, rdd)
dstream.foreachRDD { rdd =>
  rdd.foreach { x =>
    log(x)           // log() is our own logging helper
    Thread.sleep(50)
  }
}


When I kill this job, I expect elements 1 to 300 to be logged before it shuts
down. That is indeed the case when I run it locally: it waits for the job to
finish before shutting down.

But when I launch the job on the cluster in "yarn-cluster" mode, it abruptly
shuts down.
The executor prints the following log:

ERROR executor.CoarseGrainedExecutorBackend:
Driver xx.xx.xx.xxx:y disassociated! Shutting down.

and then it shuts down. It is not a graceful shutdown.

Does anybody know how to do this on YARN?