[ 
https://issues.apache.org/jira/browse/KAFKA-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14952795#comment-14952795
 ] 

Sreenivasulu Nallapati commented on KAFKA-2565:
-----------------------------------------------

This happens in the following scenario.
Our batch works as follows:
1. Even though we run our consumer in batch mode, we use the High Level 
Consumer for scalability.
2. The batch opens a Kafka consumer connector.
3. At startup, the consumer identifies the latest offset (the consumer stop 
offset, "cso") for each partition.
4. It reads messages up to the cso for this batch.
5. It writes the messages to temp files on an FTP server. Once all the 
messages are processed, it moves the temp files to their actual location on 
the FTP server.
6. If the whole data transfer succeeds, it commits the offsets to ZooKeeper.
7. If we run a single consumer to process all the partitions, there is no 
issue :)
8. If we start multiple consumers for a single topic (say three consumers and 
three partitions, one consumer per partition), the problem starts.
What we observed is this: if one of the three consumers (c1, on partition1) 
finishes its processing ahead of the other two, ZooKeeper sees a rebalance 
and starts reassigning partition1 to one of the other two running consumers. 
While this is happening, the other consumers have already consumed all their 
messages and are in the process of moving the temp files on the FTP server. 
We do not close the consumer connector until the end of the batch. The 
rebalance happens after we have stopped consuming messages.

Is there something we are missing here or doing wrong?
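
To make the batch steps above concrete, here is a minimal, Kafka-free sketch 
of one run for a single partition. It only models the logic described in the 
list (capture the stop offset at startup, read up to it, commit only if the 
transfer succeeded). The class, the method names, and the in-memory "log" are 
hypothetical stand-ins and are not part of the Kafka API or of our actual 
batch code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of one batch run; no real Kafka or FTP calls are made.
class BatchRunSketch {

    /** Outcome of one batch: the messages read and the offset left committed. */
    static final class Result {
        final List<String> messages;
        final long committedOffset;

        Result(List<String> messages, long committedOffset) {
            this.messages = messages;
            this.committedOffset = committedOffset;
        }
    }

    /**
     * @param log        in-memory stand-in for a partition (list index == offset)
     * @param committed  offset of the next message to read, as last committed
     * @param transferOk stand-in for "the FTP temp files were moved successfully"
     */
    static Result runBatch(List<String> log, long committed, boolean transferOk) {
        // Step 3: capture the consumer stop offset ("cso") once, at startup.
        long stopOffset = log.size();

        // Step 4: read messages up to, but not including, the stop offset.
        List<String> batch = new ArrayList<>();
        for (long offset = committed; offset < stopOffset; offset++) {
            batch.add(log.get((int) offset));
        }

        // Steps 5-6: commit only if the whole transfer succeeded;
        // otherwise leave the previously committed offset untouched.
        long newCommitted = transferOk ? stopOffset : committed;
        return new Result(batch, newCommitted);
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList("m0", "m1", "m2", "m3");
        Result r = runBatch(log, 1L, true);
        System.out.println("read=" + r.messages + " committed=" + r.committedOffset);
    }
}
```

In this model a failed transfer leaves the committed offset where it was, so 
the next run re-reads the same range.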




> Offset Commit is not working if multiple consumers try to commit the offset
> ---------------------------------------------------------------------------
>
>                 Key: KAFKA-2565
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2565
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>    Affects Versions: 0.8.1, 0.8.2.1, 0.8.2.2
>            Reporter: Sreenivasulu Nallapati
>            Assignee: Neha Narkhede
>
> We are seeing some strange behaviour with the commitOffsets() method of 
> kafka.javaapi.consumer.ConsumerConnector. We are committing the offsets to 
> ZooKeeper at the end of the consumer batch. We are running multiple consumers 
> for the same topic.
> Test details: 
> 1.    Created a topic with three partitions
> 2.    Started three consumers (cron jobs) at the same time. The aim is for 
> each consumer to process one partition.
> 3.    At the end of the batch, each consumer calls the commitOffsets() 
> method on kafka.javaapi.consumer.ConsumerConnector.
> 4.    The offsets are updated properly in ZooKeeper if we run the 
> consumers for a small set of messages (say 1000).
> 5.    For a larger number of messages, however, the offset commit does not 
> work as expected: sometimes only two offsets are committed properly and the 
> other one remains as it was.
> 6.    Please see the below example
> Partition: 0 Latest Offset: 1057585
> Partition: 1 Latest Offset: 1057715
> Partition: 2 Latest Offset: 1057590
> Earliest Offset after all consumers completed: {0=1057585, 1=724375, 
> 2=1057590}
> The offset for partition 1 (highlighted in red in the original report) was 
> supposed to be committed as 1057715, but it was not.
> Please check whether this is a bug with multiple consumers. When multiple 
> consumers try to update the same path in ZooKeeper, is there a 
> synchronization issue?
> Kafka Cluster details
> 1 zookeeper
> 3 brokers
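
The comparison in the example above (latest offset per partition versus the 
offset found in ZooKeeper after all consumers completed) can be automated 
with a small check like the one below. This is only a sketch using plain 
Java maps. The class and method names are hypothetical, and fetching the 
offsets from Kafka or ZooKeeper is left out; the numbers in main() are the 
ones from the report.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: finds partitions whose committed offset lags the target.
class OffsetCheckSketch {

    /** Returns the partitions whose committed offset is missing or below the latest offset. */
    static List<Integer> stalePartitions(Map<Integer, Long> latest, Map<Integer, Long> committed) {
        List<Integer> stale = new ArrayList<>();
        for (Map.Entry<Integer, Long> entry : latest.entrySet()) {
            Long c = committed.get(entry.getKey());
            if (c == null || c.longValue() < entry.getValue().longValue()) {
                stale.add(entry.getKey());
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        // Latest offsets captured at the start of the batch (from the report).
        Map<Integer, Long> latest = new LinkedHashMap<>();
        latest.put(0, 1057585L);
        latest.put(1, 1057715L);
        latest.put(2, 1057590L);

        // Offsets observed after all consumers completed (from the report).
        Map<Integer, Long> committed = new LinkedHashMap<>();
        committed.put(0, 1057585L);
        committed.put(1, 724375L);
        committed.put(2, 1057590L);

        System.out.println("Partitions not committed: " + stalePartitions(latest, committed));
    }
}
```

With the reported values, only partition 1 is flagged.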



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
