Based on a recent suggestion by Joel, I am experimenting with using flush() to
simulate batched-sync behavior.
The essence of my single-threaded producer code is:

    List<Future<RecordMetadata>> futureList = new ArrayList<>();
    for (int i = 0; i < numRecords;) {
        // 1- Send a batch of batchSz records asynchronously
        for (int batchCounter = 0; batchCounter < batchSz; ++batchCounter) {
            Future<RecordMetadata> f = producer.send(record, null);
            futureList.add(f);
            i++;
        }
        // 2- Flush after sending the batch
        producer.flush();

        // 3- Ensure all msgs in the batch were sent (acknowledged)
        for (Future<RecordMetadata> f : futureList) {
            f.get();
        }
        futureList.clear();
    }

There are actually two batch sizes in play here. One is the number of messages
between every flush() call made by the client. The other is the batch.size
setting, which controls the batching done internally by the underlying async API.

Intuitively, we either want to
  A) set both batch sizes to be equal, OR
  B) set the underlying batch.size to a sufficiently large value so as to
effectively disable internal batch management (a rough config sketch follows).
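
As a rough illustration (not taken from the run described here), the two
configurations might be set up as in the sketch below; the message size and the
concrete batch.size values are placeholder assumptions, not the values I used.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;

    // Sketch only: msgSizeBytes and the values below are assumptions.
    int msgSizeBytes = 1024;     // assumed average record size
    int batchSz = 8 * 1024;      // records between explicit flush() calls

    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");
    props.put("acks", "1");
    props.put("key.serializer",
            "org.apache.kafka.common.serialization.ByteArraySerializer");
    props.put("value.serializer",
            "org.apache.kafka.common.serialization.ByteArraySerializer");

    // Option A: internal batch roughly matches the client-side flush batch
    // props.put("batch.size", Integer.toString(batchSz * msgSizeBytes));

    // Option B: batch.size large enough that flush(), not the internal byte
    // limit, decides when data goes out on the wire
    props.put("batch.size", Integer.toString(64 * 1024 * 1024));

    KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props);

Note that batch.size is a per-partition byte limit, which is why option A's
value is derived from the assumed message size rather than the record count.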


The numbers below are in MB/s. The 'Batch' columns indicate the number of events
between each explicit client-side flush().
The setup is a 1-node broker with acks=1.

                        1 partition
                        Batch=4k    Batch=8k    Batch=16k
Equal batch sizes (A)       16          32          52
Large batch.size  (B)      140         123         124

                        4 partitions
                        Batch=4k    Batch=8k    Batch=16k
Equal batch sizes (A)       35          61          82
Large batch.size  (B)        7           7           7

                        8 partitions
                        Batch=4k    Batch=8k    Batch=16k
Equal batch sizes (A)       49          70          99
Large batch.size  (B)        7           8           7


There are two issues noticeable in these numbers:
1 - Case A is much faster than case B for 4 and 8 partitions.
2 - Single-partition mode outperforms all others, and there case B is faster than
case A.




Side note: I used the client APIs from trunk while the broker is running
0.8.2 (I don't think it matters, but nevertheless wanted to point it out).
