Ok, I think I found the cause of the problem. The default value of max_in_flight_requests_per_connection in the Python producer is 5, which turns out to be too small for my environment and application. When this limit is reached and the producer retries several times and still fails, the message is dropped. And since send() is asynchronous, the producer has no way to report the failure immediately. One solution is to block for 'synchronous' sends, as described at http://kafka-python.readthedocs.io/en/1.0.0/usage.html, but that slows down the producing speed. The solution I ended up with is to significantly increase max_in_flight_requests_per_connection, e.g. to 1024.
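The difference between fire-and-forget and blocking ("synchronous") sends can be sketched with the standard library alone; the kafka-python API is analogous, in that send() returns a future you can block on with get(). Everything below (flaky_send, the failure pattern, the pool size) is hypothetical and only stands in for a producer whose sends sometimes fail:

```python
import concurrent.futures

# Hypothetical stand-in for producer.send(): fails on some messages,
# the way a saturated in-flight window can make real sends fail.
def flaky_send(msg_id):
    if msg_id % 5 == 3:          # simulate an intermittent failure
        raise RuntimeError("send failed for message %d" % msg_id)
    return msg_id

pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)

# Fire-and-forget: each submit() returns immediately, and any error stays
# buried inside its future unless someone inspects it -- this mirrors how
# an asynchronous send() can drop a message without reporting anything.
fired = [pool.submit(flaky_send, i) for i in range(10)]

# "Synchronous" sends: block on each future so failures surface right away,
# analogous to calling future.get(timeout=...) on the result of send().
delivered, failed = [], []
for i in range(10):
    future = pool.submit(flaky_send, i)
    try:
        delivered.append(future.result(timeout=10))
    except RuntimeError:
        failed.append(i)

pool.shutdown()
print(delivered)  # messages that were acknowledged
print(failed)     # failures fire-and-forget would never have shown us
```

The trade-off is the one described above: blocking on every future gives per-message error reporting but serializes the sends, which is why raising the in-flight limit can be the more attractive fix when throughput matters.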
On Thu, Apr 28, 2016 at 10:31 AM, Bo Xu <box...@gmail.com> wrote:

> PS: The message dropping occurred intermittently, not all at the end. For
> example, the 10th, 15th, and 18th messages were missing. If it were all at
> the end, it would be understandable, because I'm not using flush() to force
> transmitting.
>
> Bo
>
> On Thu, Apr 28, 2016 at 10:15 AM, Bo Xu <box...@gmail.com> wrote:
>
>> I set up a simple Kafka configuration, with one topic and one partition.
>> I have a Python producer that continuously publishes messages to the Kafka
>> server and a Python consumer that receives messages from the server. Each
>> message is about 10K bytes, far smaller than
>> socket.request.max.bytes=104857600. What I found is that the consumer
>> intermittently missed some messages. I checked the server log file and
>> found that these messages are missing there as well, so it looks like they
>> were never stored by the server. I also made sure that the producer did not
>> receive any error for any message it published (using send()).
>>
>> Any clues as to what could have caused the problem?
>>
>> Thanks,
>> Bo