[ 
https://issues.apache.org/jira/browse/STORM-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281056#comment-15281056
 ] 

Jungtaek Lim commented on STORM-1136:
-------------------------------------

I didn't notice this had already been filed as a JIRA issue, but I have been 
thinking about the same thing (on the UI side).

Btw, I just talked with Priyank about this issue and think it would be better 
to share my thoughts on it.

- I guess Kafka is the de facto standard data source for Storm, but it is still 
not a first-class component. So storm-core shouldn't be coupled with the Kafka 
client, or at least that should be discussed first.
- That means the Kafka spout should provide this information to Nimbus so that 
the UI can query Nimbus via RPC.
- I was thinking about including partition information in the spout task's 
heartbeat, but I guess it is rather big for a heartbeat message.
-- If it doesn't affect performance or ZK load, this would be the easiest way 
to implement it.
- When providing partition information, the data structure should be general so 
that Nimbus can parse it without being coupled with the Kafka client.
-- Spark introduced a similar feature, and it stores offset information in a 
generalized data structure (StreamInputInfo): 
https://github.com/apache/spark/pull/7081
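
To make the last point a bit more concrete, here is a rough sketch of what such 
a generalized structure could look like. The class and field names 
(SpoutPartitionProgress, SpoutProgressReport) are made up for illustration; 
they are not existing Storm, Kafka, or Spark APIs, and how the report would 
actually travel to Nimbus (heartbeat or otherwise) is left open.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical, source-agnostic progress record for one input partition.
 * It holds only strings and numbers, so a reader such as Nimbus would not
 * need any Kafka client classes to interpret it.
 */
final class SpoutPartitionProgress {
    final String source;         // e.g. a topic name; opaque to the reader
    final int partition;         // partition index within the source
    final long committedOffset;  // offset the spout has committed so far
    final long latestOffset;     // latest offset available at the source

    SpoutPartitionProgress(String source, int partition,
                           long committedOffset, long latestOffset) {
        this.source = source;
        this.partition = partition;
        this.committedOffset = committedOffset;
        this.latestOffset = latestOffset;
    }

    /** Lag for this partition, clamped at zero. */
    long lag() {
        return Math.max(0L, latestOffset - committedOffset);
    }
}

/** Hypothetical container a spout task could report for all its partitions. */
final class SpoutProgressReport {
    private final List<SpoutPartitionProgress> partitions = new ArrayList<>();

    void add(SpoutPartitionProgress progress) {
        partitions.add(progress);
    }

    /** Read-only view of the per-partition records. */
    List<SpoutPartitionProgress> partitions() {
        return Collections.unmodifiableList(partitions);
    }

    /** Sum of the per-partition lags. */
    long totalLag() {
        long total = 0L;
        for (SpoutPartitionProgress p : partitions) {
            total += p.lag();
        }
        return total;
    }
}
```

With something like this, the spout would add one entry per partition it owns, 
and the UI could show lag per partition or in total without knowing that the 
source happens to be Kafka.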

> Provide a bin script to check consumer lag from KafkaSpout to Kafka topic 
> offsets
> ---------------------------------------------------------------------------------
>
>                 Key: STORM-1136
>                 URL: https://issues.apache.org/jira/browse/STORM-1136
>             Project: Apache Storm
>          Issue Type: Improvement
>          Components: storm-kafka
>            Reporter: Sriharsha Chintalapani
>            Assignee: Priyank Shah
>
> We store KafkaSpout offsets under the zkroot + id path in ZooKeeper. Kafka 
> provides a utility and a protocol request to fetch the latest offsets of a topic.
> {code}
> example:
> bin/kafka-run-class.sh kafka.tools.GetOffsetShell 
> {code}
> We should provide a way for the user to check how far the Kafka spout has read 
> into the topic and what the lag is. If we can expose this via the UI, even better.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
