[ 
https://issues.apache.org/jira/browse/SPARK-17815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15560906#comment-15560906
 ] 

Ofir Manor commented on SPARK-17815:
------------------------------------

I totally understand your concerns, and yes, I hope you will continue to 
faithfully respond to all Kafka-related messages, and yes, you should insist on 
fixing whatever you think is broken, unclear, or too complex.
I just don't see how it relates to this ticket, which is explicitly about 
helping external monitoring tools that might rely on the Kafka consumer group 
advancing its offsets.
You have some problems with the framework, specifically with its fault 
tolerance semantics, not with some specific Kafka source implementation detail:
   
https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#fault-tolerance-semantics
Since Kafka might be the first or second "real" streaming source, it does make 
a lot of sense to validate / improve the framework based on this feedback. I 
think this ticket is the wrong place for this - it will have no visibility. 
Maybe that should go to the dev list, or to the blocking ticket (which is about 
the framework), or to a new Structured Streaming JIRA (if there is a specific 
suggestion), or maybe to a SEP/SPIP. If no committer is willing to revisit 
it or discuss it in any channel (as was the case with the Kafka source for many 
months), you'll eventually have to face that this is, unfortunately, how this 
project is managed.
Anyway, I think I have maxed out my contribution to this thread. It should be 
continued with the right committers, wherever they want, not with me.

> Report committed offsets
> ------------------------
>
>                 Key: SPARK-17815
>                 URL: https://issues.apache.org/jira/browse/SPARK-17815
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Michael Armbrust
>
> Since we manage our own offsets, we have turned off auto-commit.  However, 
> this means that external tools are not able to report on how far behind a 
> given streaming job is.  When the user manually gives us a group.id, we 
> should report back to it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
