Hi,

I'm having trouble sending produce requests to a Kafka 0.9.0.0 broker
cluster consisting of 4 brokers with IDs 0, 1, 2, and 3.  All 4
brokers are running locally on my CentOS 7 development box, listening
on ports 9092, 9093, 9094, and 9095 respectively.  I am running my
own producer code (Bruce, see https://github.com/ifwe/bruce), which
works without problems with Kafka 0.8, but has problems with 0.9.0.0.
When I send a produce request consisting of a single message, I often
get a response consisting of error ACK -1 (Unknown, unexpected server
error) although I have also seen other errors such as 6
(NotLeaderForPartition).  During one observed instance of this
behavior I saw the following:

    sent produce request
    --------------------
    - single message set containing a single message
    - topic: "topic_1"
    - partition: 3
    - empty key
    - value: "hello world"
    - API key: 0 (produce request)
    - API version: 0
    - correlation ID: 1
    - empty client ID
    - required ACKs: 1
    - timeout: 10000
    - message magic byte: 0
    - message attributes: 0 (no compression)
    - destination broker ID: 3

    received produce response
    -------------------------
    - correlation ID: 1
    - topic: "topic_1"
    - partition: 3
    - error code: -1

I have included tcpdump output for the above request and response
below.  The topic/partition layout is as follows:

    $ bin/kafka-topics.sh --describe --zookeeper localhost:2181/kafka-0.9.0.0
    Topic:topic_1       PartitionCount:4        ReplicationFactor:2     Configs:
        Topic: topic_1  Partition: 0    Leader: 0       Replicas: 0,1   Isr: 0,1
        Topic: topic_1  Partition: 1    Leader: 1       Replicas: 1,2   Isr: 2,1
        Topic: topic_1  Partition: 2    Leader: 2       Replicas: 2,3   Isr: 2,3
        Topic: topic_1  Partition: 3    Leader: 3       Replicas: 3,0   Isr: 0,3
    $

Does anyone have any ideas about what may be causing this behavior?

Additionally, I was using tcpdump to watch communication between
kafka-console-producer.sh and the broker cluster, and noticed that
the console producer sends a value of 1 for the API version in its
produce requests.  Is there any documentation describing protocol
changes between API versions 0 and 1?  I looked here:

    
https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol

and didn't see anything about the new API version.  Any help is much
appreciated.

Thanks,
Dave


Produce request...

    # tcpdump -i lo -X -s 0 -nn src port 51147
    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on lo, link-type EN10MB (Ethernet), capture size 65535 bytes
    23:03:44.947589 IP 127.0.0.1.51147 > 127.0.0.1.9095: Flags [P.], seq 
2040824198:2040824280, ack 1758705421, win 342, options [nop,nop,TS val 
71321820 ecr 71222139], length 82
        0x0000:  4500 0086 4300 4000 4006 f96f 7f00 0001  E...C.@.@..o....
        0x0010:  7f00 0001 c7cb 2387 79a4 8186 68d3 b70d  ......#.y...h...
        0x0020:  8018 0156 fe7a 0000 0101 080a 0440 48dc  ...V.z.......@H.
        0x0030:  043e c37b 0000 004e 0000 0000 0000 0001  .>.{...N........
        0x0040:  ffff 0001 0000 2710 0000 0001 0007 746f  ......'.......to
        0x0050:  7069 635f 3100 0000 0100 0000 0300 0000  pic_1...........
        0x0060:  2500 0000 0000 0000 0000 0000 1973 acf7  %............s..
        0x0070:  7c00 00ff ffff ff00 0000 0b68 656c 6c6f  |..........hello
        0x0080:  2077 6f72 6c64                           .world
    23:03:44.956525 IP 127.0.0.1.51147 > 127.0.0.1.9095: Flags [.], ack 40, win 
342, options [nop,nop,TS val 71321830 ecr 71321830], length 0
        0x0000:  4500 0034 4301 4000 4006 f9c0 7f00 0001  E..4C.@.@.......
        0x0010:  7f00 0001 c7cb 2387 79a4 81d8 68d3 b734  ......#.y...h..4
        0x0020:  8010 0156 fe28 0000 0101 080a 0440 48e6  ...V.(.......@H.
        0x0030:  0440 48e6                                .@H.
    ^C
    2 packets captured
    4 packets received by filter
    0 packets dropped by kernel
    #

Produce response...

    # tcpdump -i lo -X -s 0 -nn dst port 51147
    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on lo, link-type EN10MB (Ethernet), capture size 65535 bytes
    23:03:44.947661 IP 127.0.0.1.9095 > 127.0.0.1.51147: Flags [.], ack 
2040824280, win 32742, options [nop,nop,TS val 71321821 ecr 71321820], length 0
        0x0000:  4500 0034 7476 4000 4006 c84b 7f00 0001  E..4tv@.@..K....
        0x0010:  7f00 0001 2387 c7cb 68d3 b70d 79a4 81d8  ....#...h...y...
        0x0020:  8010 7fe6 fe28 0000 0101 080a 0440 48dd  .....(.......@H.
        0x0030:  0440 48dc                                .@H.
    23:03:44.956499 IP 127.0.0.1.9095 > 127.0.0.1.51147: Flags [P.], seq 0:39, 
ack 1, win 32742, options [nop,nop,TS val 71321830 ecr 71321820], length 39
        0x0000:  4500 005b 7477 4000 4006 c823 7f00 0001  E..[tw@.@..#....
        0x0010:  7f00 0001 2387 c7cb 68d3 b70d 79a4 81d8  ....#...h...y...
        0x0020:  8018 7fe6 fe4f 0000 0101 080a 0440 48e6  .....O.......@H.
        0x0030:  0440 48dc 0000 0023 0000 0001 0000 0001  .@H....#........
        0x0040:  0007 746f 7069 635f 3100 0000 0100 0000  ..topic_1.......
        0x0050:  03ff ffff ffff ffff ffff ff              ...........
    ^C
    2 packets captured
    4 packets received by filter
    0 packets dropped by kernel
    #


Reply via email to