Re: stopping spark stream app

2016-01-29 Thread agateaaa
Hi, we recently started trying to use Spark Streaming to fetch and process data from Kafka (direct streaming, not receiver-based; Spark 1.5.2). We want to be able to stop the streaming application and tried implementing the approach suggested above, using a stopping thread and calling

Re: stopping spark stream app

2015-08-12 Thread Tathagata Das
stop() is a blocking method when stopGracefully is set to true. In that case, it obviously waits for all batches with data to complete processing. Why are you joining on the thread in the streaming listener? The listener is just a callback listener and is NOT supposed to do any long-running blocking
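The pattern described here, where the listener callback only starts a stopper thread and returns immediately rather than blocking on stop() itself, can be sketched without any Spark dependencies. This is a minimal stdlib-only illustration; FakeContext and NonBlockingStopDemo are hypothetical stand-ins for JavaStreamingContext and the listener, not Spark API.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Sketch: the listener callback must NOT block on stop(); it starts a
// separate thread that performs the blocking stop and returns at once.
public class NonBlockingStopDemo {
    static class FakeContext {  // hypothetical stand-in for JavaStreamingContext
        final CountDownLatch stopped = new CountDownLatch(1);
        // Simulates jssc.stop(true, true): blocks until shutdown work is done.
        void stop() {
            try { Thread.sleep(100); } catch (InterruptedException ignored) { }
            stopped.countDown();
        }
    }

    // What onBatchCompleted should do: start the stopper thread and return.
    static Thread onBatchCompleted(FakeContext ctx) {
        Thread stopper = new Thread(ctx::stop, "streaming-stopper");
        stopper.start();
        return stopper;  // callback returns immediately; stop() runs elsewhere
    }

    public static boolean run() throws InterruptedException {
        FakeContext ctx = new FakeContext();
        Thread stopper = onBatchCompleted(ctx);  // does not block the callback
        boolean done = ctx.stopped.await(5, TimeUnit.SECONDS);
        stopper.join();
        return done;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("stopped cleanly: " + run());
    }
}
```

The key point is that the callback thread is free the moment onBatchCompleted returns, which is what avoids the deadlock discussed below.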

Re: stopping spark stream app

2015-08-12 Thread Shushant Arora
Calling jssc.stop(false/true, false/true) from the StreamingListener causes a deadlock, so I created another thread and called jssc.stop() from it, but that too deadlocked if onBatchCompleted did not complete before jssc.stop(). So is it safe if I call System.exit(1) from another thread without

Re: stopping spark stream app

2015-08-12 Thread Tathagata Das
Well, System.exit() will not ensure all data was processed before shutdown. There should not be a deadlock if onBatchCompleted just starts the thread (that runs stop()) and completes. On Wed, Aug 12, 2015 at 1:50 AM, Shushant Arora shushantaror...@gmail.com wrote: calling

Re: stopping spark stream app

2015-08-10 Thread Shushant Arora
Any recommendation on the best way to gracefully shut down a Spark Streaming application? I am running it on YARN and need a way to signal it externally, either the yarn application -kill command or some other way, but I need the current batch to be processed completely and the checkpoint to be saved before shutting

Re: stopping spark stream app

2015-08-10 Thread Shushant Arora
Thanks! On Tue, Aug 11, 2015 at 1:34 AM, Tathagata Das t...@databricks.com wrote: 1. RPC can be done in many ways, and a web service is one of many ways. An even more hacky version can be the app polling a file in a file system: if the file exists, start shutting down. 2. No need to set a

Re: stopping spark stream app

2015-08-10 Thread Tathagata Das
In general, it is a little risky to put long-running work in a shutdown hook, as it may delay shutdown of the process, which may in turn delay other things. That said, you could try it out. A better way to shut down gracefully and explicitly is to use an RPC to signal the driver process to start shutting down,
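One way the RPC signal could look, sketched with only the JDK's built-in com.sun.net.httpserver (no Spark, no external libraries): a tiny HTTP endpoint on the driver sets a flag when hit, and the driver would then call jssc.stop(...). The endpoint path /stop and port 18099 are arbitrary choices for this sketch.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.util.concurrent.atomic.AtomicBoolean;

public class ShutdownRpcDemo {
    static final AtomicBoolean stopRequested = new AtomicBoolean(false);

    // Start a minimal HTTP server; GET /stop flips the flag that the
    // driver would check (or react to) by calling jssc.stop(true, true).
    static HttpServer start(int port) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/stop", exchange -> {
            stopRequested.set(true);
            byte[] body = "stopping".getBytes();
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) { os.write(body); }
        });
        server.start();
        return server;
    }

    public static boolean run() throws Exception {
        HttpServer server = start(18099);
        HttpURLConnection conn = (HttpURLConnection)
                new URL("http://localhost:18099/stop").openConnection();
        int code = conn.getResponseCode();  // invokes the /stop handler
        server.stop(0);
        return code == 200 && stopRequested.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("stop requested: " + run());
    }
}
```

Externally, shutting down is then just `curl http://driver-host:18099/stop` instead of yarn application -kill, giving the driver a chance to finish the current batch first.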

Re: stopping spark stream app

2015-08-10 Thread Shushant Arora
By RPC, do you mean a web service exposed on the driver that listens and sets a flag, and the driver checks that flag at the end of each batch and, if set, gracefully stops the application? On Tue, Aug 11, 2015 at 12:43 AM, Tathagata Das t...@databricks.com wrote: In general, it is a little risky to put

Re: stopping spark stream app

2015-08-10 Thread Tathagata Das
1. RPC can be done in many ways, and a web service is one of many ways. An even more hacky version can be the app polling a file in a file system: if the file exists, start shutting down. 2. No need to set a flag. When you get the signal from RPC, you can just call context.stop(stopGracefully =
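The "poll a file" variant mentioned above can be sketched with just java.nio.file. Everything Spark-specific is stubbed: onStop stands in for the context.stop(stopGracefully = true) call, and the marker file name STOP_NOW is an arbitrary choice for this sketch.

```java
import java.nio.file.Files;
import java.nio.file.Path;

public class MarkerFileStopDemo {
    // Polls for a marker file; when it appears, the driver would call
    // context.stop(stopGracefully = true). "onStop" stands in for that call.
    public static boolean waitForMarker(Path marker, long timeoutMs, Runnable onStop)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (Files.exists(marker)) {
                onStop.run();
                return true;
            }
            Thread.sleep(50);  // poll interval
        }
        return false;  // timed out without seeing the marker
    }

    public static boolean run() throws Exception {
        Path marker = Files.createTempDirectory("stopdemo").resolve("STOP_NOW");
        boolean[] stopped = {false};
        Thread poller = new Thread(() -> {
            try { waitForMarker(marker, 5000, () -> stopped[0] = true); }
            catch (InterruptedException ignored) { }
        });
        poller.start();
        Files.createFile(marker);  // the external signal: e.g. "touch STOP_NOW"
        poller.join();
        return stopped[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println("graceful stop triggered: " + run());
    }
}
```

The external trigger is then just creating the file (touch, or an hdfs put if the app polls a shared filesystem), which is the "hacky but simple" property being described.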

Re: stopping spark stream app

2015-08-09 Thread Shushant Arora
Hi, how do I ensure, in Spark Streaming 1.3 with Kafka, that when an application is killed, the last running batch is fully processed and offsets are written to the checkpoint dir? On Fri, Aug 7, 2015 at 8:56 AM, Shushant Arora shushantaror...@gmail.com wrote: Hi I am using spark stream 1.3 and using
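For the custom-checkpoint side of this, one common way to make the offset write kill-safe is write-then-atomic-rename, so the checkpoint file is never observed half-written. This is a generic sketch under that assumption, not Spark's checkpoint mechanism; the file names and the offset string format are invented for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class OffsetCheckpointDemo {
    // Writes offsets to a temp file, then atomically renames it over the
    // checkpoint file; a kill mid-write leaves the old checkpoint intact.
    public static void saveOffsets(Path checkpoint, String offsets) throws Exception {
        Path tmp = checkpoint.resolveSibling(checkpoint.getFileName() + ".tmp");
        Files.write(tmp, offsets.getBytes(StandardCharsets.UTF_8));
        Files.move(tmp, checkpoint, StandardCopyOption.ATOMIC_MOVE);
    }

    public static String run() throws Exception {
        Path dir = Files.createTempDirectory("ckpt");
        Path checkpoint = dir.resolve("kafka-offsets");
        saveOffsets(checkpoint, "topicA:0:42");  // hypothetical topic:partition:offset
        return new String(Files.readAllBytes(checkpoint), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run());
    }
}
```

The graceful-stop discussion above handles the "last batch fully processed" half; an atomic write like this handles the "offsets safely on disk" half.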

stopping spark stream app

2015-08-06 Thread Shushant Arora
Hi, I am using Spark Streaming 1.3 and a custom checkpoint to save Kafka offsets. 1. Is doing Runtime.getRuntime().addShutdownHook(new Thread() { @Override public void run() { jssc.stop(true, true); System.out.println("Inside Add Shutdown Hook"); } }); to handle stop safe? 2. And I
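The hook-registration part of the question can be shown without Spark. In this sketch, stopAction stands in for the jssc.stop(true, true) call; the later reply in this thread notes that long-running work inside such a hook is risky, so treat this as an illustration of the mechanism, not an endorsement.

```java
public class ShutdownHookDemo {
    // Registers a JVM shutdown hook that would perform the graceful stop.
    // "stopAction" is a hypothetical stand-in for jssc.stop(true, true).
    public static Thread register(Runnable stopAction) {
        Thread hook = new Thread(stopAction, "streaming-shutdown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static boolean run() {
        Thread hook = register(() -> System.out.println("Inside Add Shutdown Hook"));
        // removeShutdownHook returns true only if the hook was registered,
        // which lets us verify registration without actually exiting the JVM.
        return Runtime.getRuntime().removeShutdownHook(hook);
    }

    public static void main(String[] args) {
        System.out.println("hook registered: " + run());
    }
}
```

In a real deployment the hook would stay registered, so it fires when the process receives a normal termination signal (e.g. from YARN), but a hard kill (SIGKILL) skips shutdown hooks entirely, which is one reason the thread's RPC/marker-file approaches are preferred.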