I guess what you want is not really streaming. If you create a streaming context at time t, you will receive data arriving from time t onward, not data that was published before t.
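That said, with the direct Kafka API (KafkaUtils.createDirectStream, available since Spark 1.3) you can ask for the earliest retained offsets, so the stream also picks up data published before the context was created. A minimal sketch, assuming the spark-streaming-kafka artifact is on the classpath; the broker address and topic name are placeholders:

import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

import kafka.serializer.StringDecoder;

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

public class ReadFromEarliest {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("read-from-earliest");
    JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(5));

    Map<String, String> kafkaParams = new HashMap<String, String>();
    kafkaParams.put("metadata.broker.list", "localhost:9092"); // placeholder broker
    // "smallest" starts each partition at its earliest retained offset rather
    // than only at messages arriving after the stream is created.
    kafkaParams.put("auto.offset.reset", "smallest");

    Set<String> topics = new HashSet<String>(Arrays.asList("mytopic")); // placeholder topic

    JavaPairInputDStream<String, String> stream = KafkaUtils.createDirectStream(
        jssc, String.class, String.class, StringDecoder.class, StringDecoder.class,
        kafkaParams, topics);

    stream.print();
    jssc.start();
    jssc.awaitTermination();
  }
}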
Looks like you want a queue. Let Kafka write to a queue, consume messages from the queue, and stop when the queue is empty (see the sketch after the quoted message).

On 29 Apr 2015 14:35, "dgoldenberg" <dgoldenberg...@gmail.com> wrote:

> Hi,
>
> I'm wondering about the use case where you're not doing continuous,
> incremental streaming of data out of Kafka, but rather want to publish
> data once with your Producer(s), consume it once in your Consumer, and
> then terminate the consumer Spark job.
>
> JavaStreamingContext jssc = new JavaStreamingContext(sparkConf,
>     Durations.milliseconds(...));
>
> The batchDuration parameter is "the time interval at which streaming
> data will be divided into batches". Can this be worked somehow to cause
> Spark Streaming to just get all the available data, let all the RDDs
> within the Kafka discretized stream get processed, and then terminate,
> rather than wait another interval and try to process more data from
> Kafka?
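To have the job terminate once the backlog is drained, which is the "stop when the queue is empty" idea above, here is one hedged sketch: flag the first empty batch from inside foreachRDD and stop the context from the driver thread. Same placeholder broker and topic as the sketch above; the `drained` flag and the polling loop are just one illustrative way to do this, not an established API:

import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.atomic.AtomicBoolean;

import kafka.serializer.StringDecoder;

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

public class DrainKafkaOnce {
  public static void main(String[] args) throws InterruptedException {
    SparkConf conf = new SparkConf().setAppName("drain-kafka-once");
    JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(5));

    // Same setup as the sketch above: placeholder broker and topic, reading
    // from the earliest retained offsets.
    Map<String, String> kafkaParams = new HashMap<String, String>();
    kafkaParams.put("metadata.broker.list", "localhost:9092");
    kafkaParams.put("auto.offset.reset", "smallest");
    Set<String> topics = new HashSet<String>(Arrays.asList("mytopic"));

    JavaPairInputDStream<String, String> stream = KafkaUtils.createDirectStream(
        jssc, String.class, String.class, StringDecoder.class, StringDecoder.class,
        kafkaParams, topics);

    // Flipped once a batch interval produces no records, i.e. the "queue" is
    // empty. foreachRDD functions run on the driver, so this flag is local.
    final AtomicBoolean drained = new AtomicBoolean(false);

    stream.foreachRDD(rdd -> {
      if (rdd.isEmpty()) {
        drained.set(true);
      } else {
        System.out.println("processed " + rdd.count() + " records"); // real work goes here
      }
      return null; // the 1.x Java API's foreachRDD takes a Function<R, Void>
    });

    jssc.start();

    // Stop from the driver thread rather than from inside foreachRDD, so the
    // streaming scheduler is not asked to shut itself down mid-batch.
    while (!drained.get()) {
      Thread.sleep(500);
    }
    jssc.stop(true, true); // stop the SparkContext too, finish in-flight batches
  }
}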