Re: Large state directory with Kafka Streams

2017-02-27 Thread Eno Thereska
Hi Ian, As discussed in KAFKA-3775, proper memory management is better than throttling, and we've made some steps towards that in 0.10.1 and 0.10.2 (reducing the memory RocksDB uses, providing a global memory limit for buffers within Streams). The scenario you mention is possible, and needs to be

Re: Large state directory with Kafka Streams

2017-02-27 Thread Ian Duffy
> Yes, the partitions reflect those of the input topic. You could try to create the topic manually before Streams starts; however, that might not be ideal operationally (it's best if Streams continues to do these things automatically). I'd suggest the scaling out approach

Re: Large state directory with Kafka Streams

2017-02-27 Thread Eno Thereska
Hi Ian, Yes, the partitions reflect those of the input topic. You could try to create the topic manually before Streams starts; however, that might not be ideal operationally (it's best if Streams continues to do these things automatically). I'd suggest the scaling out

Re: Large state directory with Kafka Streams

2017-02-27 Thread Ian Duffy
Hi Eno, Thanks for the fast response. > It looks like you have a lot of partitions for the count store. I believe this isn't configurable? They were auto-created by the stream. I'm assuming it's mirrored based on the number of partitions our input topic has. > The locking part was supposed to

Re: Large state directory with Kafka Streams

2017-02-27 Thread Eno Thereska
Hi Ian, It looks like you have a lot of partitions for the count store. Each RocksDB database uses off-heap memory (around 60-70MB in 0.10.2), which will add up if you have this many stores in one instance. One solution would be to scale out your Streams application by using another Kafka
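
To make the "adds up" arithmetic concrete, here is a rough sketch of the estimate, not an official sizing formula. It takes the 60-70MB per RocksDB store figure quoted above and assumes stores are spread evenly across instances. The function name `estimate_offheap_mb` and the store count of 64 are hypothetical and chosen only for illustration; the actual partition count is not stated in the thread.

```python
import math


def estimate_offheap_mb(num_stores, instances=1, mb_per_store=(60, 70)):
    """Return a (low, high) off-heap estimate in MB for the busiest instance.

    Assumptions (not from the Kafka documentation):
    - each RocksDB store costs roughly 60-70MB off-heap, as quoted for 0.10.2
    - stores are distributed as evenly as possible across instances
    """
    if num_stores < 0 or instances < 1:
        raise ValueError("num_stores must be >= 0 and instances >= 1")
    # The busiest instance hosts the rounded-up share of the stores.
    stores_on_busiest = math.ceil(num_stores / instances)
    low, high = mb_per_store
    return stores_on_busiest * low, stores_on_busiest * high


if __name__ == "__main__":
    hypothetical_store_count = 64  # illustrative value only
    for instance_count in (1, 2, 4):
        low, high = estimate_offheap_mb(hypothetical_store_count, instance_count)
        print(f"{instance_count} instance(s): about {low}-{high} MB off-heap each")
```

Under these assumptions, doubling the number of instances roughly halves the per-instance off-heap footprint, which is the motivation for the scale-out suggestion.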

Large state directory with Kafka Streams

2017-02-27 Thread Ian Duffy
Hi All, I'm using Kafka Client 0.10.2 with Kafka Streams. I'm performing a groupByKey on a stream and seeing large files appear within my state directory. Is this expected? 90M 1_0/rocksdb/content-count-store 82M 1_1/rocksdb/content-count-store 102M 1_10/rocksdb/content-count-store 86M