Rakesh,
Have you used awaitTermination() on your ssc? If not, add this and see if it changes the behavior. I am guessing this issue may be related to the YARN deployment mode. Also try setting the deployment mode to yarn-client.
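[For reference, a minimal sketch of the driver pattern Deepak describes, assuming Spark Streaming 1.5.x; the object and app names are illustrative, and the DStream setup is elided:]

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object GracefulApp {  // hypothetical driver class
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("graceful-shutdown-test")
    val ssc = new StreamingContext(conf, Seconds(1))

    // ... set up DStreams and output operations here ...

    ssc.start()
    // Block the main thread until the context is stopped. Without this,
    // the driver's main method can return and tear the context down
    // before any in-flight batches complete.
    ssc.awaitTermination()
  }
}
```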
Thanks
Deepak

On Fri, May 13, 2016 at 10:17 AM, Rakesh H (Marketing Platform-BLR) <rakes...@flipkart.com> wrote:
> Ping!!
> Has anybody tested graceful shutdown of a Spark Streaming job in yarn-cluster mode? It looks like a defect to me.
>
> On Thu, May 12, 2016 at 12:53 PM Rakesh H (Marketing Platform-BLR) <rakes...@flipkart.com> wrote:
>> We are on Spark 1.5.1.
>> The above change was to add a shutdown hook.
>> I am not adding a shutdown hook in code, so the built-in shutdown hook is being called.
>> The driver signals that it is going to do a graceful shutdown, but the executor sees that the driver is dead and shuts down abruptly.
>> Could this issue be related to YARN? I see correct behavior locally. I did "yarn kill ...." to kill the job.
>>
>> On Thu, May 12, 2016 at 12:28 PM Deepak Sharma <deepakmc...@gmail.com> wrote:
>>> This is happening because the Spark context shuts down without shutting down the ssc first.
>>> This was the behavior till Spark 1.4 and was addressed in later releases:
>>> https://github.com/apache/spark/pull/6307
>>>
>>> Which version of Spark are you on?
>>>
>>> Thanks
>>> Deepak
>>>
>>> On Thu, May 12, 2016 at 12:14 PM, Rakesh H (Marketing Platform-BLR) <rakes...@flipkart.com> wrote:
>>>> Yes, it seems to be the case.
>>>> In this case the executors should have continued logging values till 300, but they are shut down as soon as I do "yarn kill ......".
>>>>
>>>> On Thu, May 12, 2016 at 12:11 PM Deepak Sharma <deepakmc...@gmail.com> wrote:
>>>>> So in your case, the driver is shutting down gracefully, but the executors are not.
>>>>> Is this the problem?
>>>>>
>>>>> Thanks
>>>>> Deepak
>>>>>
>>>>> On Thu, May 12, 2016 at 11:49 AM, Rakesh H (Marketing Platform-BLR) <rakes...@flipkart.com> wrote:
>>>>>> Yes, it is set to true.
>>>>>> Log of driver:
>>>>>>
>>>>>> 16/05/12 10:18:29 ERROR yarn.ApplicationMaster: RECEIVED SIGNAL 15: SIGTERM
>>>>>> 16/05/12 10:18:29 INFO streaming.StreamingContext: Invoking stop(stopGracefully=true) from shutdown hook
>>>>>> 16/05/12 10:18:29 INFO scheduler.JobGenerator: Stopping JobGenerator gracefully
>>>>>> 16/05/12 10:18:29 INFO scheduler.JobGenerator: Waiting for all received blocks to be consumed for job generation
>>>>>> 16/05/12 10:18:29 INFO scheduler.JobGenerator: Waited for all received blocks to be consumed for job generation
>>>>>>
>>>>>> Log of executor:
>>>>>>
>>>>>> 16/05/12 10:18:29 ERROR executor.CoarseGrainedExecutorBackend: Driver xx.xx.xx.xx:xxxxx disassociated! Shutting down.
>>>>>> 16/05/12 10:18:29 WARN remote.ReliableDeliverySupervisor: Association with remote system [xx.xx.xx.xx:xxxxx] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
>>>>>> 16/05/12 10:18:29 INFO storage.DiskBlockManager: Shutdown hook called
>>>>>> 16/05/12 10:18:29 INFO processors.StreamJobRunner$: VALUE -------------> 204   <-- this is the value I am logging
>>>>>> 16/05/12 10:18:29 INFO util.ShutdownHookManager: Shutdown hook called
>>>>>> 16/05/12 10:18:29 INFO processors.StreamJobRunner$: VALUE -------------> 205
>>>>>> 16/05/12 10:18:29 INFO processors.StreamJobRunner$: VALUE -------------> 206
>>>>>>
>>>>>> On Thu, May 12, 2016 at 11:45 AM Deepak Sharma <deepakmc...@gmail.com> wrote:
>>>>>>> Hi Rakesh,
>>>>>>> Did you try setting spark.streaming.stopGracefullyOnShutdown to true for your Spark configuration instance?
>>>>>>> If not, try this and let us know if it helps.
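[For reference, a sketch of how that flag can be set, assuming Spark 1.5.x; the app name is illustrative. It is a plain SparkConf property, so it can be set in code before the StreamingContext is created, or passed at submit time:]

```scala
import org.apache.spark.SparkConf

// On the SparkConf, before creating the StreamingContext:
val conf = new SparkConf()
  .setAppName("my-streaming-app")                           // illustrative name
  .set("spark.streaming.stopGracefullyOnShutdown", "true")  // stop the ssc gracefully from the shutdown hook

// Equivalent at submit time:
//   spark-submit --conf spark.streaming.stopGracefullyOnShutdown=true ...
```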
>>>>>>>
>>>>>>> Thanks
>>>>>>> Deepak
>>>>>>>
>>>>>>> On Thu, May 12, 2016 at 11:42 AM, Rakesh H (Marketing Platform-BLR) <rakes...@flipkart.com> wrote:
>>>>>>>> The issue I am having is similar to the one mentioned here:
>>>>>>>> http://stackoverflow.com/questions/36911442/how-to-stop-gracefully-a-spark-streaming-application-on-yarn
>>>>>>>>
>>>>>>>> I am creating an RDD from the sequence 1 to 300 and creating a streaming RDD out of it:
>>>>>>>>
>>>>>>>> val rdd = ssc.sparkContext.parallelize(1 to 300)
>>>>>>>> val dstream = new ConstantInputDStream(ssc, rdd)
>>>>>>>> dstream.foreachRDD { rdd =>
>>>>>>>>   rdd.foreach { x =>
>>>>>>>>     log(x)
>>>>>>>>     Thread.sleep(50)
>>>>>>>>   }
>>>>>>>> }
>>>>>>>>
>>>>>>>> When I kill this job, I expect elements 1 to 300 to be logged before shutting down. That is indeed the case when I run it locally: it waits for the job to finish before shutting down.
>>>>>>>>
>>>>>>>> But when I launch the job on the cluster in "yarn-cluster" mode, it abruptly shuts down.
>>>>>>>> The executor prints the following log:
>>>>>>>>
>>>>>>>> ERROR executor.CoarseGrainedExecutorBackend: Driver xx.xx.xx.xxx:yyyyy disassociated! Shutting down.
>>>>>>>>
>>>>>>>> and then it shuts down. It is not a graceful shutdown.
>>>>>>>>
>>>>>>>> Does anybody know how to do this on YARN?
>>>>>>>
>>>>>>> --
>>>>>>> Thanks
>>>>>>> Deepak
>>>>>>> www.bigdatabig.com
>>>>>>> www.keosha.net
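[Editor's note, not from the thread: a commonly suggested workaround for this symptom is that "yarn application -kill" tears down all of the application's containers at once, so the executors are killed even while the driver's shutdown hook is still draining batches (consistent with the logs above, where the driver begins a graceful stop but the executors immediately see it as disassociated). Sending SIGTERM only to the driver/ApplicationMaster process lets the graceful stop finish before the containers are released. A sketch, assuming shell access to the node running the ApplicationMaster and that the process is matchable by name; the pgrep pattern is illustrative:]

```shell
# List running applications to find yours and its AM host (Hadoop 2.x CLI)
yarn application -list

# On the AM node: find the ApplicationMaster PID and send SIGTERM to it only,
# leaving the executor containers alive until the graceful stop completes.
pid=$(pgrep -f ApplicationMaster)   # illustrative pattern; match your driver process
kill -SIGTERM "$pid"
```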