Hi Johnathan.
Yes, I decreased the retention on all topics simultaneously. I realized my
mistake later when I saw the cluster overloaded :)
I wasn't 100% sure, so I looked it up, but it looks to me like
log.cleaner.threads and log.cleaner.io.max.bytes.per.second only apply when a
topic is using log compaction (cleanup.policy=compact), not time-based
retention.

Howdy Vincent.
Sounds like a painful situation! I have experienced similar drama with
Kafka, so maybe I can offer some advice.
You said you decreased the retention time on 4 topics. I wonder, was this
done on all 4 topics at the same time?
Depending on broker and partition config, that can be very expensive: once
the new retention time takes effect, every broker may try to delete a large
number of log segments at once.
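In case it helps for next time, one way to spread the deletion load (a
sketch, assuming the stock kafka-configs.sh tool; the topic names and
bootstrap server are placeholders) is to lower retention.ms one topic at a
time, pausing between changes:

```shell
# Placeholder topic names; adjust --bootstrap-server for your cluster.
for topic in clicks impressions events metrics; do
  bin/kafka-configs.sh --bootstrap-server localhost:9092 \
    --alter --entity-type topics --entity-name "$topic" \
    --add-config retention.ms=2592000000   # 30 days, in milliseconds
  sleep 600   # give the brokers time to delete this topic's segments
done
```

You could also lower retention in several steps per topic (e.g. 50 days,
then 40, then 30) to smooth out the deletion I/O even further.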
Hi,
I'm wondering if there is a way to tell Kafka to spread the log file
deletion when decreasing the retention time of a topic, and if not, if
it would make sense.
I'm asking because this afternoon, after decreasing the retention time
from 2 months to 1 month on 4 of my topics, the whole cluster became
overloaded.

Hi Martin,
That is a good point. In fact, in the coming release we have made such
repartition topics really "transient" by periodically purging them with the
embedded admin client, so we can actually set their retention to -1:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-220%3A+Add+AdminClien
Hi Guozhang,
Thanks very much for your reply. I am inclined to consider this a bug, since
Kafka Streams in the default configuration is likely to run into this problem
while reprocessing old messages, and in most cases the problem wouldn't be
noticed (since there is no error -- the job just silently produces incorrect
output).

Hello Martin,
What you've observed is correct. More generally speaking, for various
broker-side operations that are based on record timestamps and treat them as
wall-clock time, there is a mismatch between the stream records' timestamps,
which are basically "event time", and the broker's system wall-clock time.

Follow-up: I think we figured out what was happening. Setting the broker config
log.message.timestamp.type=LogAppendTime (instead of the default value
CreateTime) stopped the messages disappearing.
The messages in the Streams app's input topic are older than the 24-hour
default retention period, so with CreateTime the records written to the
repartitioning topic immediately looked expired to the broker and were
deleted before the application could read them back.
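For anyone hitting the same thing: the equivalent topic-level setting is
message.timestamp.type, so the change can also be applied per topic instead
of broker-wide (the topic name here is a placeholder):

```shell
bin/kafka-configs.sh --bootstrap-server localhost:9092 \
  --alter --entity-type topics --entity-name my-input-topic \
  --add-config message.timestamp.type=LogAppendTime
```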
Hi all,
We are debugging an issue with a Kafka Streams application that is producing
incorrect output. The application is a simple group-by on a key, and then
count. As expected, the application creates a repartitioning topic for the
group-by stage. The problem appears to be that messages are getting lost in
the repartitioning topic.
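For context, the topology is essentially the following (a sketch with
placeholder topic names and serdes, not our exact code):

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;

public class CountByKey {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input-topic", Consumed.with(Serdes.String(), Serdes.String()))
               // Grouping by a newly selected key forces Streams to create an
               // internal "<application.id>-...-repartition" topic.
               .groupBy((key, value) -> value,
                        Grouped.with(Serdes.String(), Serdes.String()))
               .count(Materialized.as("counts-store"))
               .toStream()
               .to("output-topic", Produced.with(Serdes.String(), Serdes.Long()));
        // builder.build() is then passed to new KafkaStreams(topology, props).
    }
}
```

It is the records in that internal repartition topic, which carry the
original event timestamps, that seem to be affected.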