Re: Kafka topic retention question

2018-07-21 Thread Gwen Shapira
Ah, common source of confusion!

Each partition is divided into segments (1GB by default), and retention
deletes whole segments. The active segment, the one you are currently
writing to, is never deleted.

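So, until the active segment rolls (after 1GB or, by default, 7 days), you
will see all messages. If you need more accurate retention, you can
configure a smaller segment size (segment.bytes) or a shorter roll time
(segment.ms) for low-throughput topics.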
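
To make the effect concrete, here is a rough Python sketch of the behavior
described above. It is a toy model, not Kafka code: the class and function
names are made up for illustration, it only rolls segments by size (it
ignores time-based rolling), and it assumes one 1 KB message per hour with
48-hour retention.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    timestamps: List[float] = field(default_factory=list)  # message times, in hours
    size: int = 0  # bytes written so far


class PartitionLog:
    """Toy model of one partition: a list of segments, the last one active."""

    def __init__(self, segment_bytes: int, retention_hours: float):
        self.segment_bytes = segment_bytes
        self.retention_hours = retention_hours
        self.segments = [Segment()]

    def append(self, ts: float, size: int) -> None:
        active = self.segments[-1]
        # Roll to a new segment once the active one would exceed the size limit.
        if active.size > 0 and active.size + size > self.segment_bytes:
            active = Segment()
            self.segments.append(active)
        active.timestamps.append(ts)
        active.size += size

    def apply_retention(self, now: float) -> None:
        closed, active = self.segments[:-1], self.segments[-1]
        # A closed segment is deleted as a whole once its newest message is
        # older than the retention period. The active segment always stays.
        kept = [s for s in closed
                if now - max(s.timestamps) <= self.retention_hours]
        self.segments = kept + [active]

    def oldest_age(self, now: float) -> float:
        return now - min(ts for s in self.segments for ts in s.timestamps)


def simulate(segment_bytes: int) -> float:
    """Write one 1 KB message per hour for 10 days; return oldest age in hours."""
    log = PartitionLog(segment_bytes=segment_bytes, retention_hours=48)
    for hour in range(240):
        log.append(ts=float(hour), size=1024)
        log.apply_retention(now=float(hour))
    return log.oldest_age(now=239.0)


if __name__ == "__main__":
    print(simulate(segment_bytes=1024 ** 3))  # 1GB segment: never rolls here
    print(simulate(segment_bytes=24 * 1024))  # small segment: rolls about daily
```

In this model the 1GB segment never fills, so nothing is ever deleted and
the oldest message is as old as the topic. With the small segment, old
messages do get removed, but only a segment at a time, so the oldest
message can still be older than 48 hours by up to one segment's worth of
data.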

Gwen

On Sat, Jul 21, 2018, 9:28 AM David Collette wrote:

> I have been diving into Kafka for the last couple of weeks. I am running
> some low-volume (5-10 TPS) production data. I had set a default retention
> period of 48 hours on all the topics. I have just noticed some
> behavior I don't understand. I see messages in the topic that are over a
> week old and they don't seem to be removed. Is there some documentation on
> the topic cleanup process or guidance on how I can figure out why the
> messages are not being removed?
>
> This is what I am seeing on the topic:
>
> * messages from July 13 - July 15th
> * No messages from July 16th - July 18th (they were there at one point,
> but were removed because of retention, I assume)
> * messages from July 20th and July 21st (this is what I expected with the
> 48-hour retention)
>
> Any guidance would be much appreciated.
>
>
> Thanks,
> David Collette
> collett...@gmail.com
>


Kafka topic retention question

2018-07-21 Thread David Collette
I have been diving into Kafka for the last couple of weeks. I am running
some low-volume (5-10 TPS) production data. I had set a default retention
period of 48 hours on all the topics. I have just noticed some
behavior I don't understand. I see messages in the topic that are over a
week old and they don't seem to be removed. Is there some documentation on
the topic cleanup process or guidance on how I can figure out why the
messages are not being removed?

This is what I am seeing on the topic:

* messages from July 13 - July 15th
* No messages from July 16th - July 18th (they were there at one point,
but were removed because of retention, I assume)
* messages from July 20th and July 21st (this is what I expected with the
48-hour retention)

Any guidance would be much appreciated.


Thanks,
David Collette
collett...@gmail.com