Kafka Streams RocksDB high CPU usage

2020-10-27 Thread Giselle Van Dongen
Hi all, We have a Kafka Streams application which is showing high CPU usage. When profiling the application we see that many of the hotspots are related to RocksDB: flush, seek0, put, iteratorCF and get methods. We are using the default configuration for RocksDB. We read the documentation but

Kafka Streams RocksDB CPU usage

2020-10-27 Thread Giselle van Dongen
Hi all, We have a Kafka Streams job which has high CPU utilization. When profiling the job, we saw that this was for a large part due to RocksDB methods: flush, seek, put, get, iteratorCF. We use the default settings for our RocksDB state store. Which configuration parameters are most importan

Re: Partition assignment not well distributed over threads

2020-07-30 Thread Giselle Van Dongen
applications won't be running on the same machine as the broker. Clearly it has been difficult enough to optimize for two things at the same time, stickiness and balance, without introducing a third :) On Wed, Jul 29, 2020 at 4:58 AM Giselle Van Dongen <

Partition assignment not well distributed over threads

2020-07-29 Thread Giselle Van Dongen
We have a Kafka Streams (2.4) app consisting of 5 instances. It reads from a Kafka topic with 20 partitions (5 brokers). We notice that the partition assignment does not always lead to well distributed load over the different threads. We notice this at startup as well as after a recovery of a

Benchmarking streaming frameworks

2017-03-23 Thread Giselle van Dongen
work at Ghent University. The included frameworks at this time are, in no particular order, Spark, Flink, Kafka (Streams), Storm (Trident) and Drizzle. Any pointers to previous work or relevant benchmarks would be appreciated. Best regards, Giselle van Dongen