John Fung created KAFKA-372:
-------------------------------
Summary: Consumer doesn't receive all data if there are multiple
segment files
Key: KAFKA-372
URL: https://issues.apache.org/jira/browse/KAFKA-372
Project: Kafka
Issue Type: Bug
Components: core
Affects Versions: 0.8
Reporter: John Fung
This issue happens intermittently but can be reproduced by following the
steps below (repeat step 4 a few times to reproduce it):
1. Check out 0.8 branch (currently reproducible with rev. 1352634)
2. Apply kafka-306-v4.patch
3. Note that log.file.size is set to 10000000 in
system_test/broker_failure/config/server_*.properties (small enough to produce
multiple segment files)
4. In the directory <kafka home>/system_test/broker_failure, run:
$ bin/run-test.sh 20 0
5. After the test is completed, the result will probably look like the
following:
========================================================
no. of messages published : 14000
producer unique msg rec'd : 14000
source consumer msg rec'd : 7271
source consumer unique msg rec'd : 7271
mirror consumer msg rec'd : 6960
mirror consumer unique msg rec'd : 6960
total source/mirror duplicate msg : 0
source/mirror uniq msg count diff : 311
========================================================
6. Checking the Kafka log files shows that the total size of the source
cluster's segment files equals that of the target cluster's segment files:
[/tmp] $ find kafka* -name *.kafka -ls
18620155 9860 -rw-r--r-- 1 jfung eng 10096535 Jun 21 11:09 kafka-source3-logs/test01-0/00000000000000000000.kafka
18620161 9772 -rw-r--r-- 1 jfung eng 10004418 Jun 21 11:11 kafka-source3-logs/test01-0/00000000000020105286.kafka
18620160 9776 -rw-r--r-- 1 jfung eng 10008751 Jun 21 11:10 kafka-source3-logs/test01-0/00000000000010096535.kafka
18620162 4708 -rw-r--r-- 1 jfung eng 4819067 Jun 21 11:11 kafka-source3-logs/test01-0/00000000000030109704.kafka
19406431 9920 -rw-r--r-- 1 jfung eng 10157685 Jun 21 11:10 kafka-target2-logs/test01-0/00000000000010335039.kafka
19406429 10096 -rw-r--r-- 1 jfung eng 10335039 Jun 21 11:09 kafka-target2-logs/test01-0/00000000000000000000.kafka
19406432 10300 -rw-r--r-- 1 jfung eng 10544850 Jun 21 11:11 kafka-target2-logs/test01-0/00000000000020492724.kafka
19406433 3800 -rw-r--r-- 1 jfung eng 3891197 Jun 21 11:12 kafka-target2-logs/test01-0/00000000000031037574.kafka
7. If log.file.size in the target cluster is set to a value large enough that
there is only one data file, the result looks like this:
========================================================
no. of messages published : 14000
producer unique msg rec'd : 14000
source consumer msg rec'd : 7302
source consumer unique msg rec'd : 7302
mirror consumer msg rec'd : 13750
mirror consumer unique msg rec'd : 13750
total source/mirror duplicate msg : 0
source/mirror uniq msg count diff : -6448
========================================================
8. The log files for that run look like this:
[/tmp] $ find kafka* -name *.kafka -ls
18620160 9840 -rw-r--r-- 1 jfung eng 10075058 Jun 21 11:24 kafka-source2-logs/test01-0/00000000000010083679.kafka
18620155 9848 -rw-r--r-- 1 jfung eng 10083679 Jun 21 11:23 kafka-source2-logs/test01-0/00000000000000000000.kafka
18620162 4484 -rw-r--r-- 1 jfung eng 4589474 Jun 21 11:26 kafka-source2-logs/test01-0/00000000000030269045.kafka
18620161 9876 -rw-r--r-- 1 jfung eng 10110308 Jun 21 11:25 kafka-source2-logs/test01-0/00000000000020158737.kafka
19406429 34048 -rw-r--r-- 1 jfung eng 34858519 Jun 21 11:26 kafka-target3-logs/test01-0/00000000000000000000.kafka
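The arithmetic behind steps 5 through 8 can be cross-checked with a short Python sketch. It only re-adds the byte sizes and message counts quoted in this report. The helper names (total_bytes, names_are_cumulative_offsets, count_diff) are made up for illustration and are not part of Kafka or the system test. Reading each segment file name as the byte offset at which that segment starts is an inference from the listings, not something the test output states.

```python
# Segment sizes in bytes, copied from the two `find ... -ls` listings above.
# Keys are the segment file names with the leading zeros dropped.
RUN_MULTI_SEGMENT = {
    "source": {0: 10096535, 10096535: 10008751, 20105286: 10004418, 30109704: 4819067},
    "target": {0: 10335039, 10335039: 10157685, 20492724: 10544850, 31037574: 3891197},
}
RUN_SINGLE_SEGMENT = {
    "source": {0: 10083679, 10083679: 10075058, 20158737: 10110308, 30269045: 4589474},
    "target": {0: 34858519},
}


def total_bytes(segments):
    """Sum of the sizes of all segment files of one log."""
    return sum(segments.values())


def names_are_cumulative_offsets(segments):
    """True if every file name equals the total size of the segments before it.

    Assumption: the numeric file name is the starting byte offset of the segment.
    """
    expected_start = 0
    for start in sorted(segments):
        if start != expected_start:
            return False
        expected_start += segments[start]
    return True


def count_diff(source_unique, mirror_unique):
    """The 'source/mirror uniq msg count diff' line: source minus mirror."""
    return source_unique - mirror_unique


if __name__ == "__main__":
    for label, run in (("multi", RUN_MULTI_SEGMENT), ("single", RUN_SINGLE_SEGMENT)):
        for cluster, segments in run.items():
            print(label, cluster, total_bytes(segments),
                  names_are_cumulative_offsets(segments))
    print(count_diff(7271, 6960), count_diff(7302, 13750))
```

In both runs the source and target totals agree (34928771 bytes in the first run, 34858519 in the second), so the data appears to reach the target broker's log in full. The shortfall is only in what the consumers report receiving: 7271 and 6960 of 14000 messages in the first run, and 7302 and 13750 in the second.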