[ https://issues.apache.org/jira/browse/SPARK-9059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15049116#comment-15049116 ]
Benjamin Fradet edited comment on SPARK-9059 at 12/10/15 6:49 AM:
------------------------------------------------------------------

There is a Python code snippet like the Java and Scala ones in the docs on master [here|https://github.com/apache/spark/blob/master/docs/streaming-kafka-integration.md#approach-2-direct-approach-no-receivers]. However, my understanding was that this wasn't the point of this JIRA. As I understood it, the original goal was to incorporate the use of `HasOffsetRanges` into the code examples, or to duplicate them into a new example that uses it.

was (Author: benfradet):
There is a Python code snippet like the Java and Scala ones in the docs on master [here|https://github.com/apache/spark/blob/master/docs/streaming-kafka-integration.md#approach-2-direct-approach-no-receivers]. However, my understanding was that this wasn't the point of this JIRA. As I understood it, the original goal was to incorporate the use of `HasOffsetRanges` into the code examples, or to duplicate them into a new example that uses it, like the [Scala one|https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/streaming/DirectKafkaWordCount.scala].

> Update Python Direct Kafka Word count examples to show the use of
> HasOffsetRanges
> ---------------------------------------------------------------------------------
>
>                 Key: SPARK-9059
>                 URL: https://issues.apache.org/jira/browse/SPARK-9059
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Streaming
>            Reporter: Tathagata Das
>            Priority: Minor
>              Labels: starter
>
> Update Python examples of Direct Kafka word count to access the offset ranges
> using HasOffsetRanges and print them. For example in Scala,
>
> {code}
> import org.apache.spark.streaming.kafka.{HasOffsetRanges, OffsetRange}
>
> // Holds the offset ranges of the most recent batch
> var offsetRanges: Array[OffsetRange] = _
> ...
> directKafkaDStream.foreachRDD { rdd =>
>   // The cast only succeeds on the RDD produced directly by the direct stream
>   offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
> }
> ...
> transformedDStream.foreachRDD { rdd =>
>   // some operation
>   // mkString prints the ranges themselves rather than the array reference
>   println("Processed ranges: " + offsetRanges.mkString(", "))
> }
> {code}
> See https://spark.apache.org/docs/latest/streaming-kafka-integration.html for
> more info, and the master source code for more up-to-date information on Python.