Thank you, Tathagata, Cody, Otis. - Dmitry
On Mon, Jun 1, 2015 at 6:57 PM, Otis Gospodnetic <otis.gospodne...@gmail.com > wrote: > I think you can use SPM - http://sematext.com/spm - it will give you all > Spark and all Kafka metrics, including offsets broken down by topic, etc. > out of the box. I see more and more people using it to monitor various > components in data processing pipelines, a la > http://blog.sematext.com/2015/04/22/monitoring-stream-processing-tools-cassandra-kafka-and-spark/ > > Otis > > On Mon, Jun 1, 2015 at 5:23 PM, dgoldenberg <dgoldenberg...@gmail.com> > wrote: > >> Hi, >> >> What are some of the good/adopted approached to monitoring Spark Streaming >> from Kafka? I see that there are things like >> http://quantifind.github.io/KafkaOffsetMonitor, for example. Do they all >> assume that Receiver-based streaming is used? >> >> Then "Note that one disadvantage of this approach (Receiverless Approach, >> #2) is that it does not update offsets in Zookeeper, hence Zookeeper-based >> Kafka monitoring tools will not show progress. However, you can access the >> offsets processed by this approach in each batch and update Zookeeper >> yourself". >> >> The code sample, however, seems sparse. What do you need to do here? - >> directKafkaStream.foreachRDD( >> new Function<JavaPairRDD<String, String>, Void>() { >> @Override >> public Void call(JavaPairRDD<String, Integer> rdd) throws >> IOException { >> OffsetRange[] offsetRanges = >> ((HasOffsetRanges)rdd).offsetRanges >> // offsetRanges.length = # of Kafka partitions being consumed >> ... >> return null; >> } >> } >> ); >> >> and if these are updated, will KafkaOffsetMonitor work? >> >> Monitoring seems to center around the notion of a consumer group. But in >> the receiverless approach, code on the Spark consumer side doesn't seem to >> expose a consumer group parameter. Where does it go? Can I/should I just >> pass in group.id as part of the kafkaParams HashMap? >> >> Thanks >> >> >> >> -- >> View this message in context: >> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-monitor-Spark-Streaming-from-Kafka-tp23103.html >> Sent from the Apache Spark User List mailing list archive at Nabble.com. >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org >> For additional commands, e-mail: user-h...@spark.apache.org >> >> >