Bundling already-serialized messages

2018-08-19 Thread John Walker
I want to do something crazy. Someone talk me off the ledge: record Bundle { string key; array msgs; } Producers individually serialize a bunch of messages that share a key, then serialize a bundle and post to a topic. A generic Flattener service is configured by startup parameters to liste
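
One way to picture the bundling idea is a length-prefixed frame that carries the key plus each already-serialized message as opaque bytes. The sketch below is only an illustration under assumptions: the class name `BundleCodec`, its methods, and the framing layout are hypothetical stand-ins, not the poster's actual encoding for the Bundle record. It shows that a flattener can split a bundle back into its messages without deserializing any of them.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the Bundle record: one key plus opaque, already-serialized messages. */
final class BundleCodec {
    private BundleCodec() {}

    /** Layout (assumed): [keyLen][key bytes][count]([msgLen][msg bytes])* with 4-byte ints. */
    static byte[] encode(String key, List<byte[]> msgs) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        int size = Integer.BYTES + keyBytes.length + Integer.BYTES;
        for (byte[] m : msgs) {
            size += Integer.BYTES + m.length;
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        buf.putInt(keyBytes.length).put(keyBytes).putInt(msgs.size());
        for (byte[] m : msgs) {
            buf.putInt(m.length).put(m); // payload bytes are copied untouched
        }
        return buf.array();
    }

    /** Reads only the shared key from the front of the bundle. */
    static String key(byte[] bundle) {
        ByteBuffer buf = ByteBuffer.wrap(bundle);
        byte[] keyBytes = new byte[buf.getInt()];
        buf.get(keyBytes);
        return new String(keyBytes, StandardCharsets.UTF_8);
    }

    /** What a flattener would do: recover each payload without interpreting its contents. */
    static List<byte[]> flatten(byte[] bundle) {
        ByteBuffer buf = ByteBuffer.wrap(bundle);
        int keyLen = buf.getInt();
        buf.position(buf.position() + keyLen); // skip over the key bytes
        int count = buf.getInt();
        List<byte[]> msgs = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            byte[] m = new byte[buf.getInt()];
            buf.get(m);
            msgs.add(m);
        }
        return msgs;
    }
}
```

With this layout a flattener could call `BundleCodec.key(bundle)` once and then re-publish each element of `BundleCodec.flatten(bundle)` under that key.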

Re: kafka stream latency

2018-08-19 Thread Guozhang Wang
Hello Nan, Note that Streams may need some time to rebalance and assign tasks even if you only start with one instance. I'd suggest you register your state listener in Kafka Streams via KafkaStreams#setStateListener, and your customized StateListener should record when the state transitions from RE

Re: Kafka streams - runs out of memory

2018-08-19 Thread AshokKumar J
Hi Guozhang, Please find below. I have tried with the latest 2.0.0 libraries and no improvement observed. Kafka version - 1.0.1 Total Memory allocated - 24 GB Max Stream Cache - 8GB --- Processor class code: private KeyValueStore hourlyStore = null; // Loca

Re: kafka stream latency

2018-08-19 Thread Nan Xu
Right, so my Kafka cluster has already been up and running for a while, and I can see from the log that all broker instances have already changed from rebalance to running. I did another test. On the producer, right before the message gets sent to the broker, I put a timestamp in the message. And from the consumer s
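
The timestamp-in-message measurement described above could look roughly like the following. This is a minimal sketch, assuming the timestamp is carried as an 8-byte prefix on the payload; the class `LatencyStamp` and its methods are hypothetical, and the result is only meaningful when the producer and consumer clocks agree (for example, when both run on the same machine).

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical helper: carries the producer's send time inside the message bytes. */
final class LatencyStamp {
    private LatencyStamp() {}

    /** Producer side: prefix the payload with the send time in epoch milliseconds. */
    static byte[] stamp(byte[] payload, long sendTimeMs) {
        return ByteBuffer.allocate(Long.BYTES + payload.length)
                .putLong(sendTimeMs)
                .put(payload)
                .array();
    }

    /** Consumer side: elapsed milliseconds between the embedded send time and the receive time. */
    static long latencyMs(byte[] stamped, long receiveTimeMs) {
        return receiveTimeMs - ByteBuffer.wrap(stamped).getLong();
    }

    /** Consumer side: the original payload with the 8-byte timestamp prefix removed. */
    static byte[] payload(byte[] stamped) {
        return Arrays.copyOfRange(stamped, Long.BYTES, stamped.length);
    }
}
```

A producer would call `LatencyStamp.stamp(bytes, System.currentTimeMillis())` just before sending, and the consumer would call `LatencyStamp.latencyMs(value, System.currentTimeMillis())` as soon as the record arrives. The figure covers the whole producer -> broker -> consumer path, not any single phase.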

Re: kafka stream latency

2018-08-19 Thread Guozhang Wang
Okay, so you're measuring end-to-end time from producer -> broker -> Streams' consumer client. There are multiple phases that can contribute to the 100ms latency, and I cannot tell if the Streams consumer phase is the major contributor. For example, if the topic was not created before, then when the b

Re: Kafka issue

2018-08-19 Thread Shantanu Deshmukh
How many brokers are there in your cluster? This error usually comes up when a broker that is the leader for a partition dies and you are trying to access it. On Fri, Aug 17, 2018 at 9:23 PM Harish K wrote: > Hi, >I have installed Kafka and created topic but while data ingestion i get > so

Re: NetworkException exception while send/publishing records(Producer)

2018-08-19 Thread Shantanu Deshmukh
Firstly, a record size of 150 MB is too big. I am quite sure your timeout exceptions are due to such a large record. There is a setting in the producer and broker configs which allows you to specify the max message size in bytes. But still, records of 150 MB each might lead to problems with increasing volum
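
For reference, the size limits mentioned above are usually the producer's `max.request.size` and the broker's `message.max.bytes` (with `replica.fetch.max.bytes` raised to match so replicas can still fetch large records). The sketch below only assembles those keys into `java.util.Properties`; the class `MessageSizeLimits`, its method names, and any byte value passed in are illustrative assumptions, not recommendations from the thread.

```java
import java.util.Properties;

/** Hypothetical helper that collects the size-related settings in one place. */
final class MessageSizeLimits {
    private MessageSizeLimits() {}

    /** Producer-side override: upper bound on the size of a single request. */
    static Properties producerOverrides(int maxBytes) {
        Properties p = new Properties();
        p.setProperty("max.request.size", Integer.toString(maxBytes));
        return p;
    }

    /** Broker-side overrides; these keys would normally live in the broker's server.properties. */
    static Properties brokerOverrides(int maxBytes) {
        Properties p = new Properties();
        p.setProperty("message.max.bytes", Integer.toString(maxBytes));
        p.setProperty("replica.fetch.max.bytes", Integer.toString(maxBytes));
        return p;
    }
}
```

Raising these limits only makes large records legal to send; it does not address the concern about very large records causing problems as volume grows.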

Re: kafka stream latency

2018-08-19 Thread Nan Xu
Did more tests and made the test case simple. The whole setup is now a single physical machine running 3 Docker instances: a1, a2, a3. Kafka + ZooKeeper run on all of those Docker containers. The producer runs on a1 and sends a single key at an update speed of 2000 messages/s; each message is 10K in size. 3 c

Re: Kafka issue

2018-08-19 Thread Nan Xu
I did several tests: one with 10 brokers (remote servers) and one with 3 brokers (local Docker). Both exhibit the same behavior. I was thinking the same, but at least from the Kafka log I don't see a rebalance happening. And I am sure my CPU is only about half used, and all brokers are still running.

Re: Kafka issue

2018-08-19 Thread Nan Xu
Maybe I should highlight that I only publish 1 key, so only one broker is going to handle it, and only 1 stream instance handles it too. What's the typical throughput/latency I should expect in this case, assuming the processing logic is very, very simple: just get the data (an integer) and sum? I am more expect