[ 
https://issues.apache.org/jira/browse/SPARK-33833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17251925#comment-17251925
 ] 

L. C. Hsieh commented on SPARK-33833:
-------------------------------------

Hmm, since Structured Streaming (SS) commits source offsets to external 
storage by itself, it doesn't commit source offsets back to Kafka. I think we 
can do a few things here:

1. Expose the latest offsets of sources in StreamingQueryProgress, so users 
can track the lag by themselves. But as these offsets are not actually 
committed back to Kafka, users still cannot use Burrow to track the lag.
2. Commit source offsets back to Kafka under a dummy consumer group in 
KafkaSource, so users can track the lag using Burrow.
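
To illustrate option 1, here is a minimal sketch of how a user might compute 
the lag once both the processed offsets and the latest available offsets are 
known. It assumes both arrive as JSON strings shaped as topic -> partition -> 
offset. The function name kafka_lag and the sample values are hypothetical, 
and where the latest offsets come from (a progress field or a separate Kafka 
client) is left open.

```python
import json


def kafka_lag(processed_json: str, latest_json: str) -> dict:
    """Return per-partition lag as {(topic, partition): lag}.

    Both arguments are assumed to be JSON objects of the form
    {"topic": {"0": offset, "1": offset}}.
    """
    processed = json.loads(processed_json)
    latest = json.loads(latest_json)

    lag = {}
    for topic, partitions in latest.items():
        for partition, latest_offset in partitions.items():
            # Simplification: a partition with no processed offset yet is
            # treated as processed up to offset 0.
            done = processed.get(topic, {}).get(partition, 0)
            # Clamp at zero in case the two snapshots were taken at
            # slightly different times.
            lag[(topic, int(partition))] = max(latest_offset - done, 0)
    return lag


# Hypothetical sample values, not taken from a real query.
processed = '{"events": {"0": 100, "1": 250}}'
latest = '{"events": {"0": 120, "1": 250}}'

per_partition = kafka_lag(processed, latest)
total_lag = sum(per_partition.values())
print(per_partition)  # {('events', 0): 20, ('events', 1): 0}
print(total_lag)      # 20
```

A lag computed this way stays on the Spark side only; a tool such as Burrow 
would still see nothing unless the offsets are also committed to Kafka, as in 
option 2.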






> Allow Spark Structured Streaming report Kafka Lag through Burrow
> ----------------------------------------------------------------
>
>                 Key: SPARK-33833
>                 URL: https://issues.apache.org/jira/browse/SPARK-33833
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 3.0.1
>            Reporter: Sam Davarnia
>            Priority: Major
>
> Because Structured Streaming tracks Kafka offset consumption by itself, 
> it is not possible to track total Kafka lag using Burrow as with DStreams.
> We have used stream hooks as mentioned 
> [here|https://medium.com/@ronbarabash/how-to-measure-consumer-lag-in-spark-structured-streaming-6c3645e45a37]
>  
> It would be great if Spark supports this feature out of the box.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
