[ https://issues.apache.org/jira/browse/SPARK-33833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17258692#comment-17258692 ]
L. C. Hsieh edited comment on SPARK-33833 at 1/5/21, 6:27 AM:
--------------------------------------------------------------

Hmm, I ran a few tests locally. Does Burrow work only if Spark commits offset progress back to Kafka?

I added some code to commit offset progress to Kafka. After checking Kafka's "__consumer_offsets" topic, I found that whether or not Spark commits its progress to Kafka, a record for the consumer group of the Spark SS query is always present in "__consumer_offsets". According to https://github.com/linkedin/Burrow/wiki, Burrow reads consumer group info from this "__consumer_offsets" topic. So, since a record for the consumer group exists whether or not Spark commits, does that mean Burrow still works without Spark committing offset progress to Kafka? If so, then Spark doesn't need any change for this ticket.

was (Author: viirya):
Hmm, I ran a few tests locally. Does Burrow work only if Spark commits offset progress back to Kafka?

I added some code to commit offset progress to Kafka. After checking Kafka's "__consumer_offsets" topic, I found that whether or not Spark commits its progress to Kafka, a record for the consumer group of the Spark SS query is always present in "__consumer_offsets". According to https://github.com/linkedin/Burrow/wiki, Burrow reads consumer group info from this "__consumer_offsets" topic. So, since a record for the consumer group exists whether or not Spark commits, does that mean Burrow still works without Spark committing offset progress to Kafka?
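
For context on what is being measured: per partition, consumer lag is the log-end offset minus the offset the consumer has reached. Because Structured Streaming tracks consumed offsets itself, those would come from the query's progress rather than from "__consumer_offsets". Below is a minimal sketch in Python, not the Burrow or Spark implementation. It assumes the Kafka source's end offset arrives as a JSON string of the form {"topic": {"partition": offset}} and that log-end offsets are fetched separately. The function names and the "events" topic are hypothetical.

```python
import json


def parse_end_offsets(end_offset_json):
    """Flatten '{"topic": {"0": 42}}' into {("topic", 0): 42}.

    Assumption: the input has the nested topic -> partition -> offset shape.
    """
    nested = json.loads(end_offset_json)
    return {
        (topic, int(partition)): offset
        for topic, partitions in nested.items()
        for partition, offset in partitions.items()
    }


def total_lag(log_end_offsets, consumed_offsets):
    """Sum per-partition lag (log-end offset minus consumed offset).

    Assumptions: a partition with no recorded progress counts as consumed
    up to offset 0, and negative differences are clamped to 0.
    """
    return sum(
        max(end - consumed_offsets.get(tp, 0), 0)
        for tp, end in log_end_offsets.items()
    )


# Hypothetical values for illustration only.
consumed = parse_end_offsets('{"events": {"0": 90, "1": 40}}')
latest = {("events", 0): 100, ("events", 1): 50}
print(total_lag(latest, consumed))
```

Keying by (topic, partition) keeps the two offset maps directly comparable, whichever side they were fetched from.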

> Allow Spark Structured Streaming report Kafka Lag through Burrow
> ----------------------------------------------------------------
>
>                 Key: SPARK-33833
>                 URL: https://issues.apache.org/jira/browse/SPARK-33833
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 3.0.1
>            Reporter: Sam Davarnia
>            Priority: Major
>
> Because structured streaming tracks Kafka offset consumption by itself,
> it is not possible to track total Kafka lag using Burrow similar to DStreams.
> We have used Stream hooks as mentioned
> [here|https://medium.com/@ronbarabash/how-to-measure-consumer-lag-in-spark-structured-streaming-6c3645e45a37]
>
> It would be great if Spark supports this feature out of the box.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)