Hello Nitay,

Thanks for letting us know about your observations. When you restart the
application from empty local state Kafka Streams will try to restore all
the records up to the current log end for those global KTables, and during
this period of time there should be no processing.

Do you mind sharing where you get the "totalRecordsToBeRestored", is it
before the restarting was triggered?

Guozhang

On Mon, Nov 23, 2020 at 4:15 AM Nitay Kufert <nita...@ironsrc.com> wrote:

> Hey all,
> We have been running a kafka-stream based service in production for the
> last couple of years (we have 4 brokers on this specific cluster).
> Inside this service, which does a lot of stuff - there are 2 GlobalKTables
> (which are based on 2 compacted topics).
>
> When I restart my app and clean the local state - the restoration of those
> topics begins, and the weird thing I am noticing is that for *much *fewer
> messages, 1 topic takes *a lot* more time to complete.
>
> Let's call the compacted topics A and B, both are compacted and mainly
> configured the same:
> Replication 3
> Number of Partitions 36
>
> max.message.bytes 25000000
> segment.index.bytes 2097152
> min.cleanable.dirty.ratio 0.1
> cleanup.policy compact
> delete.retention.ms 900000
> segment.bytes 52428800
> segment.ms 3600000
> *WIth the exception, that for topic B, we use *
> cleanup.policy compact,delete
> retention.ms 604800000
> *The behaviour i've noticed for a single partition:*
> Topic A has 62,552 totalRecordsToBeRestored - and it takes around 20s
> Topic B has 24,730,506 totalRecordsToBeRestored - and it takes around 1s
>
> It is worth mentioning that the data that B holds for each record *is much
> much bigger *(A holds an integer while B holds a big object)*.*
>
> Now, I get the feeling that the reason is because the data in B is always
> relatively "fresh", while the data in topic A is mostly stale (the business
> logic behind it suggests that topic A updates at a very low rate - and a
> lot of keys would never be updated)
> So, for example, it's probably holding keys that haven't been updated since
> 2018.
> Topic B keeps getting updated every couple of milliseconds.
>
> Another difference that I just realized i might share is that topic A is
> being "joined" by 6 other streams, while topic B is only being joined by 2.
>
> I find it hard to explain the relation between keeping "old" records and
> the huge difference in number of records and their size.
>
> So I guess I am missing some basic concept when it comes to understanding
> compacted topics and the way the broker saves and fetches the data OR maybe
> we have some underlying problem which can explain it.
>
> Let me know if you need some more info
>
> Thanks!
>
> --
>
> Nitay Kufert
> Backend Team Leader
> [image: ironSource] <http://www.ironsrc.com>
>
> email nita...@ironsrc.com
> mobile +972-54-5480021
> fax +972-77-5448273
> skype nitay.kufert.ssa
> 121 Menachem Begin St., Tel Aviv, Israel
> ironsrc.com <http://www.ironsrc.com>
> [image: linkedin] <https://www.linkedin.com/company/ironsource> [image:
> twitter] <https://twitter.com/ironsource> [image: facebook]
> <https://www.facebook.com/ironSource> [image: googleplus]
> <https://plus.google.com/+ironsrc>
> This email (including any attachments) is for the sole use of the intended
> recipient and may contain confidential information which may be protected
> by legal privilege. If you are not the intended recipient, or the employee
> or agent responsible for delivering it to the intended recipient, you are
> hereby notified that any use, dissemination, distribution or copying of
> this communication and/or its content is strictly prohibited. If you are
> not the intended recipient, please immediately notify us by reply email or
> by telephone, delete this email and destroy any copies. Thank you.
>


-- 
-- Guozhang

Reply via email to