You could do a couple of things.

1. You can explicitly call streamingContext.stop() when the first batch is
over (a Spark Streaming batch is what you are calling an iteration). To
detect when the first batch has completed, you can use the
StreamingListener interface (
http://spark.incubator.apache.org/docs/latest/api/streaming/index.html#org.apache.spark.streaming.scheduler.StreamingListener
). Create your own StreamingListener object and attach it to the context
using StreamingContext.addStreamingListener(<your listener object>) (
http://spark.incubator.apache.org/docs/latest/api/streaming/index.html#org.apache.spark.streaming.StreamingContext
). The functions in the interface are called every time a batch completes.
See the first sketch below.

2. If you just want to test, you could use StreamingContext.queueStream,
which takes a queue of RDDs; any RDD pushed into the queue is treated as a
batch of data for the stream. So you can manually create an RDD with your
input data and push it into the queue in a controlled manner for your
testing. See the second sketch below.
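
Here is a minimal sketch of approach 1, assuming ssc is the
StreamingContext from your word count program. FirstBatchListener and the
latch are illustrative names, not part of the Spark API. Stopping the
context from inside the callback itself can block, so the sketch signals
the main thread and stops from there:

    import java.util.concurrent.CountDownLatch
    import org.apache.spark.streaming.scheduler.{StreamingListener, StreamingListenerBatchCompleted}

    // Illustrative listener: counts down a latch once a batch completes.
    class FirstBatchListener(latch: CountDownLatch) extends StreamingListener {
      override def onBatchCompleted(batchCompleted: StreamingListenerBatchCompleted) {
        latch.countDown()
      }
    }

    // In the driver, after wiring up the word count DStream operations:
    val latch = new CountDownLatch(1)
    ssc.addStreamingListener(new FirstBatchListener(latch))
    ssc.start()
    latch.await()   // block the main thread until the first batch is done
    ssc.stop()      // stop the streaming context; the program then exits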
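
And a minimal sketch of approach 2 using queueStream; the master, app
name, batch duration, and sample data are made up for illustration:

    import scala.collection.mutable
    import org.apache.spark.SparkContext
    import org.apache.spark.rdd.RDD
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val sc = new SparkContext("local[2]", "QueueStreamTest")
    val ssc = new StreamingContext(sc, Seconds(1))

    // Each RDD in the queue is consumed as one batch of the stream.
    val rddQueue = new mutable.Queue[RDD[String]]()
    // Seed the queue with exactly one RDD, i.e. one controlled batch.
    rddQueue += sc.parallelize(Seq("hello world", "hello spark"))

    val lines = ssc.queueStream(rddQueue)
    lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).print()
    ssc.start()

Combined with the listener above, this gives you a word count that
processes a single batch of your test data and then shuts itself down
without Ctrl+C.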

TD


On Tue, Feb 11, 2014 at 3:59 AM, Kal El <pinu.datri...@yahoo.com> wrote:

> I have managed to run the wordcount example with your help (thank you
> guys!)
>
> Now my question is the following: How do I make the job run only one
> iteration?
> I am currently running the wordcount example with a local file that does
> not change, so after I have my results in the console (after the first
> iteration, that is) I would like the program to stop and exit automatically
> (and not by using Ctrl+C).
>
> I believe this can be done by triggering the awaitTermination() function,
> but how can I do that?
>
> Thanks
>
