John Fung created KAFKA-372:
-------------------------------

             Summary: Consumer doesn't receive all data if there are multiple segment files
                 Key: KAFKA-372
                 URL: https://issues.apache.org/jira/browse/KAFKA-372
             Project: Kafka
          Issue Type: Bug
          Components: core
    Affects Versions: 0.8
            Reporter: John Fung


This issue occurs intermittently but can be reproduced by following the 
steps below (step 4 may need to be repeated a few times):

1. Check out 0.8 branch (currently reproducible with rev. 1352634)

2. Apply kafka-306-v4.patch

3. Note that log.file.size is set to 10000000 in 
system_test/broker_failure/config/server_*.properties (small enough to produce 
multiple segment files)

4. In the directory <kafka home>/system_test/broker_failure, run the command:
$ bin/run-test.sh 20 0

5. After the test is completed, the result will probably look like the 
following:

========================================================
no. of messages published            : 14000
producer unique msg rec'd            : 14000
source consumer msg rec'd            : 7271
source consumer unique msg rec'd     : 7271
mirror consumer msg rec'd            : 6960
mirror consumer unique msg rec'd     : 6960
total source/mirror duplicate msg    : 0
source/mirror uniq msg count diff    : 311
========================================================

6. Checking the Kafka log files shows that the total size of the segment files 
in the source cluster equals the total size of those in the target cluster.

[/tmp] $  find kafka* -name *.kafka -ls

18620155 9860 -rw-r--r--   1 jfung    eng      10096535 Jun 21 11:09 kafka-source3-logs/test01-0/00000000000000000000.kafka
18620161 9772 -rw-r--r--   1 jfung    eng      10004418 Jun 21 11:11 kafka-source3-logs/test01-0/00000000000020105286.kafka
18620160 9776 -rw-r--r--   1 jfung    eng      10008751 Jun 21 11:10 kafka-source3-logs/test01-0/00000000000010096535.kafka
18620162 4708 -rw-r--r--   1 jfung    eng       4819067 Jun 21 11:11 kafka-source3-logs/test01-0/00000000000030109704.kafka
19406431 9920 -rw-r--r--   1 jfung    eng      10157685 Jun 21 11:10 kafka-target2-logs/test01-0/00000000000010335039.kafka
19406429 10096 -rw-r--r--   1 jfung    eng      10335039 Jun 21 11:09 kafka-target2-logs/test01-0/00000000000000000000.kafka
19406432 10300 -rw-r--r--   1 jfung    eng      10544850 Jun 21 11:11 kafka-target2-logs/test01-0/00000000000020492724.kafka
19406433 3800 -rw-r--r--   1 jfung    eng       3891197 Jun 21 11:12 kafka-target2-logs/test01-0/00000000000031037574.kafka
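
The size comparison in step 6 can be double-checked with a short script. The sketch below is in Python and is not part of the Kafka tree. It hard-codes the sizes and file names from the listing above and assumes, as the names suggest, that each segment file is named after the byte offset at which it starts. The helper name check_segments is hypothetical.

```python
# Sizes in bytes keyed by segment file name, copied from the listing in step 6.
source = {
    "00000000000000000000.kafka": 10096535,
    "00000000000010096535.kafka": 10008751,
    "00000000000020105286.kafka": 10004418,
    "00000000000030109704.kafka": 4819067,
}
target = {
    "00000000000000000000.kafka": 10335039,
    "00000000000010335039.kafka": 10157685,
    "00000000000020492724.kafka": 10544850,
    "00000000000031037574.kafka": 3891197,
}


def check_segments(segments):
    """Return the total size in bytes of a set of segment files.

    Assumption (not confirmed by the report): a segment's file name is the
    byte offset of its first byte, so it should equal the running total of
    the sizes of all earlier segments.
    """
    offset = 0
    # Names are zero-padded to the same width, so sorting by name is
    # the same as sorting by starting offset.
    for name in sorted(segments):
        start = int(name.split(".")[0])
        assert start == offset, f"{name}: expected start offset {offset}"
        offset += segments[name]
    return offset


source_total = check_segments(source)
target_total = check_segments(target)
print(source_total, target_total, source_total == target_total)
```

Both totals come to 34928771 bytes, and every file name matches the running total of the preceding segment sizes, so the data on disk looks complete in both clusters even though the consumers received only about half of the messages.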

7. If log.file.size in the target cluster is set to a value large enough that 
there is only one data file, the result looks like this:

========================================================
no. of messages published            : 14000
producer unique msg rec'd            : 14000
source consumer msg rec'd            : 7302
source consumer unique msg rec'd     : 7302
mirror consumer msg rec'd            : 13750
mirror consumer unique msg rec'd     : 13750
total source/mirror duplicate msg    : 0
source/mirror uniq msg count diff    : -6448
========================================================
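
The "count diff" line in both summaries is consistent with the source consumer's unique count minus the mirror consumer's. A minimal Python sketch, with the figures from steps 5 and 7 hard-coded and a hypothetical helper named summarize, recomputes that difference and the fraction of published messages each consumer received:

```python
PUBLISHED = 14000

# (source consumer unique msgs, mirror consumer unique msgs), copied from
# the two result summaries in this report.
runs = {
    "step 5 (multiple target segments)": (7271, 6960),
    "step 7 (single target segment)": (7302, 13750),
}


def summarize(source_unique, mirror_unique, published=PUBLISHED):
    """Return (diff, source fraction, mirror fraction) for one test run."""
    diff = source_unique - mirror_unique
    return diff, source_unique / published, mirror_unique / published


results = {name: summarize(*counts) for name, counts in runs.items()}
for name, (diff, src_frac, mir_frac) in results.items():
    print(f"{name}: diff={diff}, source={src_frac:.1%}, mirror={mir_frac:.1%}")
```

In both runs the source consumer, which reads a multi-segment log, receives only about 52% of the published messages, while the mirror consumer rises from about 50% to about 98% once the target log is a single file.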

8. In that case the log files look like this:

[/tmp] $ find kafka* -name *.kafka -ls

18620160 9840 -rw-r--r--   1 jfung    eng      10075058 Jun 21 11:24 kafka-source2-logs/test01-0/00000000000010083679.kafka
18620155 9848 -rw-r--r--   1 jfung    eng      10083679 Jun 21 11:23 kafka-source2-logs/test01-0/00000000000000000000.kafka
18620162 4484 -rw-r--r--   1 jfung    eng       4589474 Jun 21 11:26 kafka-source2-logs/test01-0/00000000000030269045.kafka
18620161 9876 -rw-r--r--   1 jfung    eng      10110308 Jun 21 11:25 kafka-source2-logs/test01-0/00000000000020158737.kafka
19406429 34048 -rw-r--r--   1 jfung    eng      34858519 Jun 21 11:26 kafka-target3-logs/test01-0/00000000000000000000.kafka
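
To avoid adding the sizes by hand, the find output can be totalled per log directory. The following Python sketch is one way to do that. It assumes the usual "find -ls" column layout (size in the seventh column, path last), one entry per line, and no spaces in paths. The function name total_by_log_dir is made up for illustration.

```python
from collections import defaultdict

# Listing from step 8, with each entry joined onto a single line.
LISTING = """\
18620160 9840 -rw-r--r--   1 jfung    eng      10075058 Jun 21 11:24 kafka-source2-logs/test01-0/00000000000010083679.kafka
18620155 9848 -rw-r--r--   1 jfung    eng      10083679 Jun 21 11:23 kafka-source2-logs/test01-0/00000000000000000000.kafka
18620162 4484 -rw-r--r--   1 jfung    eng       4589474 Jun 21 11:26 kafka-source2-logs/test01-0/00000000000030269045.kafka
18620161 9876 -rw-r--r--   1 jfung    eng      10110308 Jun 21 11:25 kafka-source2-logs/test01-0/00000000000020158737.kafka
19406429 34048 -rw-r--r--   1 jfung    eng      34858519 Jun 21 11:26 kafka-target3-logs/test01-0/00000000000000000000.kafka
"""


def total_by_log_dir(listing):
    """Sum file sizes per top-level log directory from `find -ls` output."""
    totals = defaultdict(int)
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 11:
            continue  # skip blank or truncated lines
        size = int(fields[6])                 # seventh column: size in bytes
        log_dir = fields[-1].split("/")[0]    # e.g. "kafka-source2-logs"
        totals[log_dir] += size
    return dict(totals)


totals = total_by_log_dir(LISTING)
print(totals)
```

For the listing in step 8, both kafka-source2-logs and kafka-target3-logs total 34858519 bytes.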



        
