Where does the "20 minutes" come from? I thought the "aggregate" operator in your
stream->aggregate->filter->foreach topology is not a windowed aggregation, so the
aggregate results will keep accumulating.

Guozhang

On Tue, Nov 29, 2016 at 8:40 PM, Jon Yeargers <jon.yearg...@cedexis.com> wrote:

> "keep increasing" - why? It seems (to me) that the aggregates should be 20
> minutes long. After that the memory should be released.
>
> Not true?
>
> On Tue, Nov 29, 2016 at 3:53 PM, Guozhang Wang <wangg...@gmail.com> wrote:
>
> > Jon,
> >
> > Note that if your "aggregate" function is not a windowed aggregate,
> > then the aggregation results will keep growing in your local state
> > stores unless you're sure that the aggregate key space is bounded.
> > This affects not only disk space but also memory, since the current
> > default persistent state store, RocksDB, keeps its own block cache
> > both on heap and off heap.
> >
> > In addition, RocksDB has write / space amplification. That is, if the
> > estimated size of the aggregated state store is 1 GB, it can actually
> > take 1 GB * max-amplification-factor in the worst case, and similarly
> > for the block cache and write buffer.
> >
> > Guozhang
> >
> > On Tue, Nov 29, 2016 at 12:25 PM, Jon Yeargers <jon.yearg...@cedexis.com> wrote:
> >
> > > App eventually got OOM-killed. Consumed 53G of swap space.
> > >
> > > Does it require a different GC? Some extra settings for the java cmd line?
> > >
> > > On Tue, Nov 29, 2016 at 12:05 PM, Jon Yeargers <jon.yearg...@cedexis.com> wrote:
> > >
> > > > I cloned/built 10.2.0-SNAPSHOT.
> > > >
> > > > The app hasn't been OOM-killed yet, but it's up to 66% mem.
> > > >
> > > > The app now takes > 10 min to start. Needless to say this is problematic.
> > > >
> > > > The 'kafka-streams' scratch space has consumed 37G and is still climbing.
> > > > On Tue, Nov 29, 2016 at 10:48 AM, Jon Yeargers <jon.yearg...@cedexis.com> wrote:
> > > >
> > > > > Does every broker need to be updated or just my client app(s)?
> > > > >
> > > > > On Tue, Nov 29, 2016 at 10:46 AM, Matthias J. Sax <matth...@confluent.io> wrote:
> > > > >
> > > > > > What version do you use?
> > > > > >
> > > > > > There is a memory leak in the latest version, 0.10.1.0. The bug
> > > > > > has already been fixed in trunk and the 0.10.1 branch.
> > > > > >
> > > > > > There is already a discussion about a 0.10.1.1 bug-fix release.
> > > > > > For now, you could build Kafka Streams from the sources yourself.
> > > > > >
> > > > > > -Matthias
> > > > > >
> > > > > > On 11/29/16 10:30 AM, Jon Yeargers wrote:
> > > > > > > My KStreams app seems to be having some memory issues.
> > > > > > >
> > > > > > > 1. I start it with `java -Xmx8G -jar <app>.jar`
> > > > > > >
> > > > > > > 2. Wait 5-10 minutes - I see lots of 'org.apache.zookeeper.ClientCnxn -
> > > > > > > Got ping response for sessionid: 0xc58abee3e000013 after 0ms' messages.
> > > > > > >
> > > > > > > 3. When it _finally_ starts reading values it typically runs for
> > > > > > > a minute or so, reads a few thousand values, and then the OS
> > > > > > > kills it with an 'Out of memory' error.
> > > > > > >
> > > > > > > The topology is (essentially):
> > > > > > >
> > > > > > > stream->aggregate->filter->foreach
> > > > > > >
> > > > > > > It's reading values and creating a rolling average.
> > > > > > >
> > > > > > > During phase 2 (above) I see lots of IO wait, and the 'scratch'
> > > > > > > buffer (usually "/tmp/kafka-streams/appname") fills with 10s of
> > > > > > > Gb of .. ? (I had to create a dedicated scratch partition with
> > > > > > > 100Gb of space, as Kafka would fill the / partition and make the
> > > > > > > system very, very unhappy.)
> > > > > > >
> > > > > > > Have I misconfigured something?
--
-- Guozhang
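The distinction Guozhang draws above can be sketched without Kafka Streams at all. The following is a minimal, self-contained illustration in plain Java (it is not the Kafka Streams API; the class and parameter names are invented for the example): a windowed aggregate can evict windows that fall outside its retention period, so its store size is bounded by roughly retention / window size times the key space, whereas a non-windowed aggregate never evicts and grows for as long as new keys and updates arrive.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, NOT the Kafka Streams API: shows why a windowed
// aggregate's state stays bounded while a non-windowed one only grows.
public class WindowedStateSketch {
    // windowStart -> (key -> running sum), mirroring a windowed state store
    private final TreeMap<Long, Map<String, Double>> windows = new TreeMap<>();
    private final long windowSizeMs;
    private final long retentionMs;

    public WindowedStateSketch(long windowSizeMs, long retentionMs) {
        this.windowSizeMs = windowSizeMs;
        this.retentionMs = retentionMs;
    }

    public void add(String key, double value, long timestampMs) {
        long windowStart = timestampMs - (timestampMs % windowSizeMs);
        windows.computeIfAbsent(windowStart, w -> new HashMap<>())
               .merge(key, value, Double::sum);
        // Evict windows older than the retention period. This is the step a
        // non-windowed aggregate never performs, so its store only grows.
        windows.headMap(timestampMs - retentionMs).clear();
    }

    public int windowCount() {
        return windows.size();
    }

    public static void main(String[] args) {
        long minute = 60_000L;
        // 20-minute windows, retained for one hour (illustrative numbers).
        WindowedStateSketch s = new WindowedStateSketch(20 * minute, 60 * minute);
        for (long t = 0; t < 24 * 60 * minute; t += minute) {
            s.add("host-1", 1.0, t);
        }
        // Despite 24h of input, only retention/windowSize (~3) windows remain.
        System.out.println("windows retained: " + s.windowCount());
    }
}
```

The "20 minutes" Jon expected only happens if the aggregate is declared with a window of that size; a plain (non-windowed) `aggregate` keeps one ever-growing entry per key forever.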
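Two of Jon's pain points, the `/tmp/kafka-streams` directory filling the root partition and unbounded buffering memory, are addressable through Streams configuration. A sketch of the relevant settings follows; the config keys (`state.dir`, `cache.max.bytes.buffering`, the latter introduced in 0.10.1) are real, but the application id, broker address, paths, and values are illustrative assumptions, not recommendations:

```java
import java.util.Properties;

// Sketch of Streams settings relevant to this thread. Keys are real config
// names; all values below are illustrative placeholders.
public class StreamsMemoryConfig {
    public static Properties build() {
        Properties p = new Properties();
        p.put("application.id", "rolling-average-app");  // hypothetical app id
        p.put("bootstrap.servers", "localhost:9092");    // placeholder broker
        // Move RocksDB state off /tmp so it cannot fill the root partition.
        p.put("state.dir", "/data/kafka-streams");
        // Bound the record caches shared across stream threads, in bytes.
        p.put("cache.max.bytes.buffering", "10485760");  // 10 MB, illustrative
        return p;
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```

Note that these settings bound the Streams record cache and relocate the state directory, but they do not bound a non-windowed aggregate's key space, and RocksDB's own block cache and write buffers live largely off heap, which is why `-Xmx8G` alone did not prevent the OS-level OOM kill.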