Recording - Storm & Kafka Meetup on April 20th 2017

2017-04-21 Thread Roshan Naik
Louro (Hortonworks) - [20m] – Rethinking the Storm 2.0 Worker - Roshan Naik (Hortonworks) - [57m] – Storm in Retail Context: Catalog data processing using Kafka, Storm & Microservices - Karthik Deivasigamani (WalMart Labs) - [1h:54m:45sec] – Schema Regi

Duplication when switching to preferred leader

2016-02-25 Thread Roshan Naik
Here is a case of data duplication that should be avoidable. It is observed when leadership of a partition changes from the current leader back to the preferred leader. Steps to reproduce: - Using a 3-broker setup. - Create a topic with 1 partition, replication factor=3, ISR count=2 and

Re: [DISCUSS] KIP-26 - Add Copycat, a connector framework for data import/export

2015-06-22 Thread Roshan Naik
Thanks Jay and Ewen for the response. @Jay 3. This has a built in notion of parallelism throughout. It was not obvious what it will look like or how it will differ from existing systems... since all of the existing ones do parallelize data movement. @Ewen, Import: Flume is just one of many similar systems

Re: [DISCUSS] KIP-26 - Add Copycat, a connector framework for data import/export

2015-06-19 Thread Roshan Naik
My initial thoughts: Although it is discussed very broadly, I did struggle a bit to properly grasp the value this adds over the alternative approaches that are available today (or need a little work to accomplish) in specific use cases. I feel it's better to take specific common

Re: Perf testing flush() - issues found

2015-04-29 Thread Roshan Naik
For some reason the HTML formatting is being dropped from my email, making it harder to read the measurements table. On 4/29/15 8:32 PM, Roshan Naik ros...@hortonworks.com wrote: @Jay, My bad. I mistook the batch.size to be number of messages instead of bytes. Below are revised measurements

Re: Perf testing flush() - issues found

2015-04-29 Thread Roshan Naik
@Jay, My bad. I mistook the batch.size to be number of messages instead of bytes. Below are revised measurements based on computing the batch.size in bytes. @Jun, With explicit flush()... linger should not have an impact, should it? @Wang, Larger batches are not necessarily giving better
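The mix-up above is between a record count and a byte count. As a rough sketch of the conversion, assuming a hypothetical helper `batchSizeBytes` and illustrative figures (1000 records of about 100 bytes each, neither taken from the thread), the value for `batch.size` could be estimated as follows. Per-record overhead is ignored here, so the result is only an approximation.

```java
public class BatchSizeBytes {

    /**
     * Hypothetical helper: estimates a byte value for batch.size from a
     * desired record count. Ignores per-record overhead (an assumption).
     */
    static int batchSizeBytes(int recordsPerBatch, int approxRecordBytes) {
        // multiplyExact throws ArithmeticException on int overflow
        return Math.multiplyExact(recordsPerBatch, approxRecordBytes);
    }

    public static void main(String[] args) {
        // Illustrative numbers only: 1000 records of roughly 100 bytes each
        int bytes = batchSizeBytes(1000, 100);
        System.out.println("batch.size=" + bytes);
    }
}
```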

Perf testing flush() - issues found

2015-04-28 Thread Roshan Naik
Based on a recent suggestion by Joel, I am experimenting with using flush() to simulate batched-sync behavior. The essence of my single threaded producer code is : for (int i = 0; i < numRecords;) { // 1- Send a batch for (int batchCounter = 0; batchCounter < batchSz;
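The snippet above is cut off after the first step. A minimal sketch of the batched-sync pattern it describes might look like the following. It is not the poster's actual code: `Sender` is a hypothetical stand-in for the producer's `send()` and `flush()` calls so the example runs without a broker, `RecordingSender` and `sendInBatches` are illustrative names, and the flush step after each batch is an assumption based on the description.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchedSyncSketch {

    /** Hypothetical stand-in for the two producer calls used in this pattern. */
    interface Sender {
        void send(String record); // asynchronous enqueue
        void flush();             // blocks until everything sent so far completes
    }

    /** In-memory Sender that records how many records each flush covered. */
    static class RecordingSender implements Sender {
        final List<String> pending = new ArrayList<>();
        final List<Integer> flushedBatchSizes = new ArrayList<>();

        @Override
        public void send(String record) {
            pending.add(record);
        }

        @Override
        public void flush() {
            flushedBatchSizes.add(pending.size());
            pending.clear();
        }
    }

    /** Sends numRecords records, flushing after every batchSz sends. */
    static void sendInBatches(Sender sender, int numRecords, int batchSz) {
        for (int i = 0; i < numRecords; ) {
            // 1- Send a batch (the final batch may be smaller than batchSz)
            for (int batchCounter = 0; batchCounter < batchSz && i < numRecords; batchCounter++, i++) {
                sender.send("record-" + i);
            }
            // 2- Flush so the whole batch completes before the next one starts
            //    (assumed step; the original snippet is truncated before this point)
            sender.flush();
        }
    }

    public static void main(String[] args) {
        RecordingSender sender = new RecordingSender();
        sendInBatches(sender, 10, 4);
        System.out.println(sender.flushedBatchSizes); // prints [4, 4, 2]
    }
}
```

With a real producer, the `send()` calls would return immediately and `flush()` would be the only blocking point per batch, which is what gives the batched-sync behavior.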

Re: Perf testing flush() - issues found

2015-04-28 Thread Roshan Naik
:58:43AM +, Roshan Naik wrote: Based on recent suggestion by Joel, I am experimenting with using flush() to simulate batched-sync behavior. The essence of my single threaded producer code is : for (int i = 0; i < numRecords;) { // 1- Send a batch for(int batchCounter=0