Repository: spark Updated Branches: refs/heads/branch-1.6 d6d31815f -> f7c6c95f9
[SPARK-11335][STREAMING] update kafka direct python docs on how to get the offset ranges for a KafkaRDD tdas koeninger This updates the Spark Streaming + Kafka Integration Guide doc with a working method to access the offsets of a `KafkaRDD` through Python. Author: Nick Evans <m...@nicolasevans.org> Closes #9289 from manygrams/update_kafka_direct_python_docs. (cherry picked from commit dd77e278b99e45c20fdefb1c795f3c5148d577db) Signed-off-by: Tathagata Das <tathagata.das1...@gmail.com> Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f7c6c95f Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f7c6c95f Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f7c6c95f Branch: refs/heads/branch-1.6 Commit: f7c6c95f92828d601bf8a582e3fed61bf98c068d Parents: d6d3181 Author: Nick Evans <m...@nicolasevans.org> Authored: Wed Nov 11 13:29:30 2015 -0800 Committer: Tathagata Das <tathagata.das1...@gmail.com> Committed: Wed Nov 11 13:29:39 2015 -0800 ---------------------------------------------------------------------- docs/streaming-kafka-integration.md | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/spark/blob/f7c6c95f/docs/streaming-kafka-integration.md ---------------------------------------------------------------------- diff --git a/docs/streaming-kafka-integration.md b/docs/streaming-kafka-integration.md index ab7f011..b00351b 100644 --- a/docs/streaming-kafka-integration.md +++ b/docs/streaming-kafka-integration.md @@ -181,7 +181,20 @@ Next, we discuss how to use this approach in your streaming application. ); </div> <div data-lang="python" markdown="1"> - Not supported yet + offsetRanges = [] + + def storeOffsetRanges(rdd): + global offsetRanges + offsetRanges = rdd.offsetRanges() + return rdd + + def printOffsetRanges(rdd): + for o in offsetRanges: + print "%s %s %s %s" % (o.topic, o.partition, o.fromOffset, o.untilOffset) + + directKafkaStream\ + .transform(storeOffsetRanges)\ + .foreachRDD(printOffsetRanges) </div> </div> --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org