Re: [streaming] KafkaUtils.createDirectStream - how to start streming from checkpoints?

2015-12-07 Thread Cody Koeninger
Just to be clear, Spark checkpoints have nothing to do with ZooKeeper; they're stored in the filesystem you specify.

On Sun, Dec 6, 2015 at 1:25 AM, manasdebashiskar wrote:
> When you enable check pointing your offsets get written in zookeeper. If you
> program dies or
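
To illustrate the point about checkpoints living in the filesystem, here is a minimal sketch of checkpoint recovery with the direct stream, assuming the Spark 1.x spark-streaming-kafka artifact. The checkpoint path, broker address, and topic name are made-up placeholders, not values from this thread.

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object CheckpointedDirectStream {
  // Hypothetical values: replace with your own directory, brokers and topics.
  val checkpointDir = "hdfs:///tmp/direct-stream-checkpoint"
  val kafkaParams = Map(
    "metadata.broker.list" -> "localhost:9092",
    // Only consulted when there is no checkpoint to recover from.
    "auto.offset.reset" -> "smallest")
  val topics = Set("events")

  // Builds a fresh context; called only when checkpointDir holds no checkpoint.
  def createContext(): StreamingContext = {
    val conf = new SparkConf().setAppName("CheckpointedDirectStream")
    val ssc = new StreamingContext(conf, Seconds(10))
    ssc.checkpoint(checkpointDir) // checkpoint data goes to this filesystem path
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)
    stream.map(_._2).print() // print message values of each batch
    ssc
  }

  def main(args: Array[String]): Unit = {
    // Recover the context (including stream offsets) from the checkpoint
    // if one exists, otherwise build a new one with createContext.
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()
  }
}
```

All stream setup has to happen inside the creating function, since on recovery the context is rebuilt from the checkpoint rather than from that code.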

Re: [streaming] KafkaUtils.createDirectStream - how to start streming from checkpoints?

2015-12-05 Thread manasdebashiskar
When you enable checkpointing, your offsets get written to ZooKeeper. If your program dies or shuts down and is later restarted, the Kafka direct stream API knows where to start by looking at those offsets in ZooKeeper. This is as easy as it gets. However, if you are planning to re-use the same checkpoint

[streaming] KafkaUtils.createDirectStream - how to start streming from checkpoints?

2015-11-24 Thread ponkin
Hi, when I create a stream with KafkaUtils.createDirectStream I can explicitly define the position, "largest" or "smallest", from which to read the topic. What if I have previous checkpoints (in HDFS, for example) with offsets, and I want to start reading from the last checkpoint? In source code of
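
For the alternative raised elsewhere in this thread, keeping offsets in an external datastore, the following is a rough sketch assuming the Spark 1.x spark-streaming-kafka artifact. `loadOffsets` is a hypothetical stand-in for your own storage lookup, and the broker address and topic name are placeholders.

```scala
import kafka.common.TopicAndPartition
import kafka.message.MessageAndMetadata
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.{HasOffsetRanges, KafkaUtils}

object ResumeFromStoredOffsets {
  // Hypothetical helper: in practice, read the last saved offsets from your datastore.
  def loadOffsets(): Map[TopicAndPartition, Long] =
    Map(TopicAndPartition("events", 0) -> 0L)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ResumeFromStoredOffsets")
    val ssc = new StreamingContext(conf, Seconds(10))
    val kafkaParams = Map("metadata.broker.list" -> "localhost:9092")

    // Turns each Kafka record into a (key, value) pair.
    val messageHandler =
      (mmd: MessageAndMetadata[String, String]) => (mmd.key, mmd.message)

    // This overload starts each partition at the explicitly supplied offset.
    val stream = KafkaUtils.createDirectStream[
      String, String, StringDecoder, StringDecoder, (String, String)](
      ssc, kafkaParams, loadOffsets(), messageHandler)

    stream.foreachRDD { rdd =>
      // Offset ranges covered by this batch; persist them where loadOffsets reads.
      val ranges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
      ranges.foreach { r =>
        println(s"${r.topic} ${r.partition} ${r.fromOffset} ${r.untilOffset}")
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

The cast to `HasOffsetRanges` only works on the RDD produced directly by the direct stream, so it should be done before any transformation that changes partitioning.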

Re: [streaming] KafkaUtils.createDirectStream - how to start streming from checkpoints?

2015-11-24 Thread Deng Ching-Mallete
> ets in external datastore?
>
> Alexey Ponkin

Re: [streaming] KafkaUtils.createDirectStream - how to start streming from checkpoints?

2015-11-24 Thread Понькин Алексей
>> park to start reading from saved offsets (in checkpoints)?
>> Is it possible at all or I need to store offsets in external datastore?
>>
>> Alexey Ponkin