Hey Anh,

For a simple read/write StreamTask with little logic in it, you should
be able to get 10,000+ messages/sec per container with a 1 KB message
payload when talking to a remote Kafka broker.


At first glance, setting a batch size of 1 with a sync producer will
definitely slow down your task, especially if request.required.acks is set
to a value other than zero.
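
As a point of comparison, an async producer in Kafka 0.8 is typically
configured along these lines (the property names are from the 0.8 producer
API; the values are illustrative, and in a Samza job config they would be
prefixed under systems.kafka.producer.*):

```
producer.type=async
batch.num.messages=200
queue.buffering.max.ms=500
request.required.acks=1
```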

Could you please post your job config file, and your code (if that's OK)?


Cheers,
Chris

On 3/28/14 8:00 AM, "Anh Thu Vu" <[email protected]> wrote:

>I forgot to clarify this. My application is a simple pipeline of 2 jobs:
>The first one reads from a file and writes to a Kafka topic.
>The second reads from that Kafka topic.
>
>The measured throughput is taken in the second job (recording a timestamp
>when the 1st, 1000th, ... message is received).
>
>Casey
>
>
>On Fri, Mar 28, 2014 at 3:56 PM, Anh Thu Vu <[email protected]> wrote:
>
>> Hi guys,
>>
>> I'm running my application both locally and on a small cluster of 5
>> nodes (each with 2 GB RAM, 1 core, connected via standard Ethernet, I
>> think), and the observed throughput seems very slow.
>>
>> Do you have any idea of the expected throughput when running with a
>> single 7200 RPM hard drive?
>> My measured throughput is about 1000 messages per second. Each message
>> is slightly more than 1 kB; Kafka batch size = 1, sync producer.
>>
>> When I try the async producer with various batch sizes, there is only a
>> slight improvement.
>>
>> The config for the job has only the essential properties.
>>
>> Any suggestions? Could I have misconfigured something?
>>
>> Casey
>>
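
The measurement approach described above (timestamping the 1st, 1000th, ...
message in the consuming job) can be sketched as a small helper. The class
name and reporting interval are illustrative, not from Casey's code, and the
Samza StreamTask wiring is omitted:

```java
// Illustrative throughput meter: call onMessage() once per processed
// message; it records a timestamp every `interval` messages and returns
// the messages/sec rate over that interval (or -1 between boundaries).
public class ThroughputMeter {
    private final long interval;
    private long count = 0;
    private long lastTimestamp = System.nanoTime();

    public ThroughputMeter(long interval) {
        this.interval = interval;
    }

    public double onMessage() {
        count++;
        if (count % interval == 0) {
            long now = System.nanoTime();
            double seconds = (now - lastTimestamp) / 1e9;
            lastTimestamp = now;
            return interval / seconds;
        }
        return -1;
    }

    public static void main(String[] args) {
        ThroughputMeter meter = new ThroughputMeter(1000);
        for (int i = 0; i < 3000; i++) {
            double rate = meter.onMessage();
            if (rate >= 0) {
                System.out.printf("~%.0f msgs/sec%n", rate);
            }
        }
    }
}
```

In a real task, onMessage() would be called from the StreamTask's process()
method; keeping the interval large (1000+) keeps the measurement overhead
negligible.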
