[
https://issues.apache.org/jira/browse/SAMZA-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281068#comment-14281068
]
Joel Koshy commented on SAMZA-503:
----------------------------------
To add to what Guozhang wrote - the Kafka broker replica followers set the
consumerId to their brokerId which is non-negative. This allows them to access
offsets that are past the last committed message. Regular consumers (which is
your case) should set it to kafka.api.Request.OrdinaryConsumerId. Otherwise
(say, if you use hashcode as I noticed in an earlier comment) you could be
exposed to uncommitted messages.
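
As a rough illustration of the rule above (a simplified model, not Kafka's broker code): the class and method names below are hypothetical, and the constant is assumed to mirror kafka.api.Request.OrdinaryConsumerId, which I believe is -1 in 0.8.x.

```java
// Hypothetical model of the visibility rule described above; not Kafka source code.
final class FetchVisibility {
    // Assumed to mirror kafka.api.Request.OrdinaryConsumerId in Kafka 0.8.x.
    static final int ORDINARY_CONSUMER_ID = -1;

    // Replica followers identify themselves with their (non-negative) brokerId.
    static boolean isReplicaFollower(int replicaId) {
        return replicaId >= 0;
    }

    // Followers may read up to the log end offset; ordinary consumers
    // are bounded by the high watermark (the last committed message).
    static long maxReadableOffset(int replicaId, long highWatermark, long logEndOffset) {
        return isReplicaFollower(replicaId) ? logEndOffset : highWatermark;
    }
}
```

With a high watermark of 100 and a log end offset of 120, an id of ORDINARY_CONSUMER_ID is capped at 100, while any non-negative id (for example, a hash code that happens to be positive) would be allowed to read up to 120, i.e. into uncommitted messages.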
> Lag gauge very slow to update for slow jobs
> -------------------------------------------
>
> Key: SAMZA-503
> URL: https://issues.apache.org/jira/browse/SAMZA-503
> Project: Samza
> Issue Type: Bug
> Components: metrics
> Affects Versions: 0.8.0
> Environment: Mac OS X, Oracle Java 7, ProcessJobFactory
> Reporter: Roger Hoover
> Assignee: Yan Fang
> Fix For: 0.9.0
>
> Attachments: SAMZA-503.patch
>
>
> For slow jobs, the
> KafkaSystemConsumerMetrics.%s-%s-messages-behind-high-watermark gauge does
> not get updated very often.
> To reproduce:
> * Create a job that processes one message and sleeps for 5 seconds
> * Create its input topic but do not populate it yet
> * Start the job
> * Load 1000s of messages to its input topic. You can keep adding messages
> with a "watch -n 1 <kafka console producer command>"
> What happens:
> * Run jconsole to view the JMX metrics
> * The %s-%s-messages-behind-high-watermark gauge will stay at 0 for a LONG
> time (~10 minutes?) before finally updating.
> What should happen:
> * The gauge should get updated at a reasonable interval (at least every few
> seconds)
> I think what's happening is that the BrokerProxy only updates the high
> watermark when a consumer is ready for more messages. When the job is this
> slow, that rarely happens, so the metric doesn't get updated.
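
A minimal sketch of the suspected behaviour, assuming the gauge is simply the high watermark minus the next offset to consume. LagGaugeModel and its methods are hypothetical names for illustration, not Samza's BrokerProxy or metrics API, and the periodic refresh is only one possible remedy, not necessarily what the attached patch does.

```java
// Hypothetical model of the stall described in the report; not Samza code.
final class LagGaugeModel {
    private long nextOffset = 0; // next offset the job will consume
    private long gauge = 0;      // stands in for messages-behind-high-watermark

    // Suspected current behaviour: the gauge only moves when the
    // consumer is ready for more messages and a fetch is issued.
    void onFetch(long highWatermark, int messagesConsumed) {
        nextOffset += messagesConsumed;
        gauge = Math.max(0, highWatermark - nextOffset);
    }

    // One possible remedy: refresh from the latest high watermark on a
    // timer, independent of how fast the job consumes.
    void onPeriodicRefresh(long highWatermark) {
        gauge = Math.max(0, highWatermark - nextOffset);
    }

    long value() {
        return gauge;
    }
}
```

In this model, loading 1000 messages while the job sleeps leaves value() at 0 until the next onFetch call, which matches the symptom above; calling onPeriodicRefresh(1000) in the meantime would report the lag without waiting for the job.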
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)