John Fung created KAFKA-372:
-------------------------------

             Summary: Consumer doesn't receive all data if there are multiple segment files
                 Key: KAFKA-372
                 URL: https://issues.apache.org/jira/browse/KAFKA-372
             Project: Kafka
          Issue Type: Bug
          Components: core
    Affects Versions: 0.8
            Reporter: John Fung

This issue occurs intermittently, but it can be reproduced by following the steps below (repeat step 4 a few times if it does not show up on the first run):

1. Check out the 0.8 branch (currently reproducible with rev. 1352634).

2. Apply kafka-306-v4.patch.

3. Note that log.file.size is set to 10000000 in system_test/broker_failure/config/server_*.properties (small enough to produce multiple segment files).

4. Under the directory <kafka home>/system_test/broker_failure, execute the command:

   $ bin/run-test.sh 20 0

5. After the test completes, the result will probably look like the following:

========================================================
no. of messages published            : 14000
producer unique msg rec'd            : 14000
source consumer msg rec'd            : 7271
source consumer unique msg rec'd     : 7271
mirror consumer msg rec'd            : 6960
mirror consumer unique msg rec'd     : 6960
total source/mirror duplicate msg    : 0
source/mirror uniq msg count diff    : 311
========================================================

6. Checking the Kafka log files shows that the total size of the segment files in the source cluster equals the total size of those in the target cluster:

[/tmp] $ find kafka* -name *.kafka -ls
18620155  9860 -rw-r--r--   1 jfung    eng      10096535 Jun 21 11:09 kafka-source3-logs/test01-0/00000000000000000000.kafka
18620161  9772 -rw-r--r--   1 jfung    eng      10004418 Jun 21 11:11 kafka-source3-logs/test01-0/00000000000020105286.kafka
18620160  9776 -rw-r--r--   1 jfung    eng      10008751 Jun 21 11:10 kafka-source3-logs/test01-0/00000000000010096535.kafka
18620162  4708 -rw-r--r--   1 jfung    eng       4819067 Jun 21 11:11 kafka-source3-logs/test01-0/00000000000030109704.kafka
19406431  9920 -rw-r--r--   1 jfung    eng      10157685 Jun 21 11:10 kafka-target2-logs/test01-0/00000000000010335039.kafka
19406429 10096 -rw-r--r--   1 jfung    eng      10335039 Jun 21 11:09 kafka-target2-logs/test01-0/00000000000000000000.kafka
19406432 10300 -rw-r--r--   1 jfung    eng      10544850 Jun 21 11:11 kafka-target2-logs/test01-0/00000000000020492724.kafka
19406433  3800 -rw-r--r--   1 jfung    eng       3891197 Jun 21 11:12 kafka-target2-logs/test01-0/00000000000031037574.kafka

7. If log.file.size in the target cluster is configured to a very large value such that there is only 1 data file, the result would look like this:

========================================================
no. of messages published            : 14000
producer unique msg rec'd            : 14000
source consumer msg rec'd            : 7302
source consumer unique msg rec'd     : 7302
mirror consumer msg rec'd            : 13750
mirror consumer unique msg rec'd     : 13750
total source/mirror duplicate msg    : 0
source/mirror uniq msg count diff    : -6448
========================================================

8.
The log files in this case look like this:

[/tmp] $ find kafka* -name *.kafka -ls
18620160  9840 -rw-r--r--   1 jfung    eng      10075058 Jun 21 11:24 kafka-source2-logs/test01-0/00000000000010083679.kafka
18620155  9848 -rw-r--r--   1 jfung    eng      10083679 Jun 21 11:23 kafka-source2-logs/test01-0/00000000000000000000.kafka
18620162  4484 -rw-r--r--   1 jfung    eng       4589474 Jun 21 11:26 kafka-source2-logs/test01-0/00000000000030269045.kafka
18620161  9876 -rw-r--r--   1 jfung    eng      10110308 Jun 21 11:25 kafka-source2-logs/test01-0/00000000000020158737.kafka
19406429 34048 -rw-r--r--   1 jfung    eng      34858519 Jun 21 11:26 kafka-target3-logs/test01-0/00000000000000000000.kafka
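
As a cross-check on the listings in steps 6 and 8, the following is a minimal Python sketch that sums the segment sizes per log directory and counts how many published messages each consumer missed. The file names, sizes, and message counts are copied from the output above; the function names (total_bytes, names_match_offsets) and the dictionary layout are illustrative only and are not part of Kafka or the system test. The check that each segment file name equals the cumulative byte size of the preceding segments is an observation drawn from these listings, not a statement about how the broker is implemented.

```python
# Segment listings copied from the report: (file name, size in bytes).
LOGS = {
    "kafka-source3-logs": [
        ("00000000000000000000.kafka", 10096535),
        ("00000000000010096535.kafka", 10008751),
        ("00000000000020105286.kafka", 10004418),
        ("00000000000030109704.kafka", 4819067),
    ],
    "kafka-target2-logs": [
        ("00000000000000000000.kafka", 10335039),
        ("00000000000010335039.kafka", 10157685),
        ("00000000000020492724.kafka", 10544850),
        ("00000000000031037574.kafka", 3891197),
    ],
    "kafka-source2-logs": [
        ("00000000000000000000.kafka", 10083679),
        ("00000000000010083679.kafka", 10075058),
        ("00000000000020158737.kafka", 10110308),
        ("00000000000030269045.kafka", 4589474),
    ],
    "kafka-target3-logs": [
        ("00000000000000000000.kafka", 34858519),
    ],
}

# Consumer counts copied from the two result summaries in the report.
PUBLISHED = 14000
RECEIVED = {
    "multi-segment target": {"source": 7271, "mirror": 6960},
    "single-segment target": {"source": 7302, "mirror": 13750},
}


def total_bytes(segments):
    """Sum the sizes of all segment files in one log directory."""
    return sum(size for _, size in segments)


def names_match_offsets(segments):
    """Return True if each file name equals the running total of the
    sizes of the segments that sort before it."""
    expected = 0
    for name, size in sorted(segments):
        if int(name.split(".")[0]) != expected:
            return False
        expected += size
    return True


def missing(run):
    """Messages published but not received, per consumer, for one run."""
    return {who: PUBLISHED - count for who, count in RECEIVED[run].items()}


if __name__ == "__main__":
    for directory, segments in LOGS.items():
        print(directory, total_bytes(segments), names_match_offsets(segments))
    for run in RECEIVED:
        print(run, missing(run))
```

With the numbers from this report, the source and target totals agree in both runs, which suggests the data reaches disk on both clusters and that the shortfall is on the consumer read path.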