Cool! Thank you Matthias!
On Sun, 25 Aug 2019 at 15:11, Matthias J. Sax <matth...@confluent.io> wrote:
> You cannot delete arbitrary data, however, it's possible to send a
> "truncate request" to brokers, to delete data before the retention time
> is reached:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-107%3A+Add+deleteRecordsBefore%28%29+API+in+AdminClient
>
> There is `AdminClient#deleteRecords(...)` API to do so.
>
>
> -Matthias
>
> On 8/21/19 9:09 PM, Murilo Tavares wrote:
> > Thanks Matthias for the prompt response.
> > Now just for curiosity, how does that work? I thought it was not possible
> > to easily delete topic data...
> >
> >
> > On Wed, Aug 21, 2019 at 4:51 PM Matthias J. Sax <matth...@confluent.io>
> > wrote:
> >
> >> No need to worry about this.
> >>
> >> Kafka Streams uses "purge data" calls to actively delete data from
> >> those topics after the records are processed. Hence, those topics won't
> >> grow unbounded but are "truncated" on a regular basis.
> >>
> >>
> >> -Matthias
> >>
> >> On 8/21/19 11:38 AM, Murilo Tavares wrote:
> >>> Hi
> >>> I have a complex KafkaStreams topology, where I have a bunch of KTables
> >>> that I regroup (rekeying) and aggregate so I can join them.
> >>> I've noticed that the "-repartition" topics created by the groupBy
> >>> operations have a very long retention by default (Long.MAX_VALUE).
> >>> I'm a bit concerned about the size of these topics, as they will retain
> >>> data forever. I wonder why they are so long, and what would be the impact
> >>> of reducing this retention?
> >>> Thanks
> >>> Murilo
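[Editor's note: for readers who want to try the delete-records API mentioned above without writing Java, Kafka also ships a CLI wrapper for it. The sketch below is an assumption-laden example, not from the thread: the broker address (localhost:9092), topic name (my-topic), partition, and offset are all placeholders. It writes the JSON offsets file the tool expects and then asks the brokers to delete all records in partition 0 of my-topic with offsets below 42.]

cat > delete-records.json <<'EOF'
{
  "partitions": [
    { "topic": "my-topic", "partition": 0, "offset": 42 }
  ],
  "version": 1
}
EOF

kafka-delete-records.sh --bootstrap-server localhost:9092 \
    --offset-json-file delete-records.json

As Matthias notes, this truncates from the beginning of the log up to the given offset; it cannot remove arbitrary records from the middle of a partition.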