Hi, I'm wondering about the use case where you're not doing continuous, incremental streaming of data out of Kafka, but instead want to publish data once with your Producer(s), consume it once in your Consumer, and then terminate the consumer Spark job.
JavaStreamingContext jssc = new JavaStreamingContext(sparkConf, Durations.milliseconds(...));

The batchDuration parameter is "the time interval at which streaming data will be divided into batches". Can this be configured so that Spark Streaming reads all of the currently available data, processes every RDD in the Kafka discretized stream, and then simply finishes and terminates, rather than waiting another interval and trying to pull more data from Kafka?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-stream-all-data-out-of-a-Kafka-topic-once-then-terminate-job-tp22698.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
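[Editor's note] One way to get the read-once-then-terminate behavior described above is to skip the streaming context entirely and use the batch Kafka API, KafkaUtils.createRDD, which reads a fixed offset range and then lets the job end naturally. The sketch below illustrates that approach; the topic name, broker address, and offset values are illustrative assumptions, not taken from this post, and in practice the upper offset would be obtained by querying Kafka for the latest offsets.

```java
import java.util.HashMap;
import java.util.Map;

import kafka.serializer.StringDecoder;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.streaming.kafka.KafkaUtils;
import org.apache.spark.streaming.kafka.OffsetRange;

public class OneShotKafkaRead {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("one-shot-kafka");
        JavaSparkContext jsc = new JavaSparkContext(conf);

        Map<String, String> kafkaParams = new HashMap<>();
        // Assumed broker address for illustration only.
        kafkaParams.put("metadata.broker.list", "broker1:9092");

        // Read a fixed offset range from partition 0 of a hypothetical topic.
        // The until-offset (1000L here) is a placeholder; a real job would
        // look up the topic's latest offsets first.
        OffsetRange[] ranges = { OffsetRange.create("my-topic", 0, 0L, 1000L) };

        JavaPairRDD<String, String> rdd = KafkaUtils.createRDD(
                jsc, String.class, String.class,
                StringDecoder.class, StringDecoder.class,
                kafkaParams, ranges);

        // Process each message exactly once, as a normal batch job.
        rdd.foreach(record -> System.out.println(record._2()));

        // No StreamingContext, so nothing waits for another batch interval;
        // the job terminates here.
        jsc.stop();
    }
}
```

Because this is a plain batch job, there is no batchDuration to tune and no need to stop a running StreamingContext: once the offset range has been processed, the driver exits on its own.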