Spark Streaming 1.5.2+Kafka+Python (docs)

2015-12-23 Thread Vyacheslav Yanuk
Colleagues,
The documentation for createDirectStream says the following:

"This does not use Zookeeper to store offsets. The consumed offsets are
tracked by the stream itself. For interoperability with Kafka monitoring
tools that depend on Zookeeper, you have to update Kafka/Zookeeper yourself
from the streaming application. You can access the offsets used in each
batch from the generated RDDs (see   "

My question is: how can I access the offsets used in each batch?
What should I see?

-- 
WBR, Vyacheslav Yanuk
Codeminders.com


Re: Spark Streaming 1.5.2+Kafka+Python (docs)

2015-12-23 Thread Cody Koeninger
Read the documentation
spark.apache.org/docs/latest/streaming-kafka-integration.html
If you still have questions, read the resources linked from
https://github.com/koeninger/kafka-exactly-once
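
For reference, the linked integration guide includes a Python example along
these lines. The sketch below is illustrative, not a drop-in solution: the
topic name, broker list, batch interval, and app name are placeholders, and
the transform step should be the first operation applied to the stream so the
offsets are still attached to the KafkaRDDs it produces.

from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils

sc = SparkContext(appName="DirectKafkaOffsets")   # placeholder app name
ssc = StreamingContext(sc, 10)                    # placeholder batch interval

# Placeholder topic and broker list
directKafkaStream = KafkaUtils.createDirectStream(
    ssc, ["my_topic"], {"metadata.broker.list": "localhost:9092"})

offsetRanges = []

def storeOffsetRanges(rdd):
    # RDDs generated directly by the direct stream expose offsetRanges()
    global offsetRanges
    offsetRanges = rdd.offsetRanges()
    return rdd

def printOffsetRanges(rdd):
    # Each OffsetRange carries topic, partition, fromOffset, untilOffset --
    # this is what you should see for every batch.
    for o in offsetRanges:
        print("%s %s %s %s" % (o.topic, o.partition, o.fromOffset, o.untilOffset))

directKafkaStream \
    .transform(storeOffsetRanges) \
    .foreachRDD(printOffsetRanges)

ssc.start()
ssc.awaitTermination()

Instead of printing, the same place is where you could write the offsets to
Zookeeper or your own store, which is the interoperability the quoted passage
refers to.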

On Wed, Dec 23, 2015 at 7:24 AM, Vyacheslav Yanuk wrote:

> Colleagues,
> The documentation for createDirectStream says the following:
>
> "This does not use Zookeeper to store offsets. The consumed offsets are
> tracked by the stream itself. For interoperability with Kafka monitoring
> tools that depend on Zookeeper, you have to update Kafka/Zookeeper yourself
> from the streaming application. You can access the offsets used in each
> batch from the generated RDDs (see   "
>
> My question is: how can I access the offsets used in each batch?
> What should I see?
>
> --
> WBR, Vyacheslav Yanuk
> Codeminders.com
>