Hi Ryanne

I have a quick question for you about Active+Active replication and Kafka
Streams. First, do you or your org use Kafka Streams? If not, then I
think this conversation can end here. ;)

Secondly, and for the broader Kafka Dev group: what happens if I want to
use Active+Active replication with my Kafka Streams app, say, to
materialize a simple KTable? Based on my understanding, a topic "table" on
the primary cluster will be replicated to the secondary cluster as
"primary.table". In the case of a full cluster failure for primary, the
producer to topic "table" on the primary switches over to the secondary
cluster, creates its own "table" topic there and continues writing to it. So
now, assuming we have had no data loss, we end up with:


*Primary Cluster: (Dead)*


*Secondary Cluster: (Live)*
Topic: "primary.table" (contains data from T = 0 to T = n)
Topic: "table" (contains data from T = n+1 to now)

If I want to materialize state using Kafka Streams, obviously I am now
in a bit of a pickle, since I need to consume "primary.table" before I
consume "table". Have you encountered rebuilding state in Kafka Streams
using Active+Active? For non-Kafka Streams I can see using one
consumer for "primary.table" and one for "table", interleaving the
records by timestamp and performing basic event dispatching based on my own
tracked stream-time, but for Kafka Streams I don't think there exists a
solution to this.
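
To make the non-Kafka-Streams idea concrete, here is a rough sketch of what
I have in mind. It is only an illustration: the two consumers are replaced by
in-memory lists, the names Record, interleave and materialize are made up for
this example, and it assumes each input is already ordered by timestamp
(which in a real deployment holds per partition at best).

```python
import heapq
from typing import Dict, Iterable, Iterator, NamedTuple, Optional, Tuple


class Record(NamedTuple):
    timestamp: int          # event time, e.g. epoch millis
    key: str
    value: Optional[str]    # None is treated as a tombstone (delete)


def interleave(*streams: Iterable[Record]) -> Iterator[Record]:
    """Merge inputs that are each already sorted by timestamp.

    heapq.merge is stable, so on equal timestamps records from the
    earlier argument (here: "primary.table") are emitted first.
    """
    return heapq.merge(*streams, key=lambda r: r.timestamp)


def materialize(records: Iterable[Record]) -> Tuple[Dict[str, str], int]:
    """Apply records in order; the latest value per key wins.

    Returns the table and the highest timestamp seen (my own stream-time).
    """
    table: Dict[str, str] = {}
    stream_time = -1
    for rec in records:
        stream_time = max(stream_time, rec.timestamp)
        if rec.value is None:
            table.pop(rec.key, None)
        else:
            table[rec.key] = rec.value
    return table, stream_time


# Stand-ins for what the two consumers would read (hypothetical data).
primary_table = [Record(0, "a", "1"), Record(5, "b", "2"), Record(9, "a", "3")]
table = [Record(10, "b", "4"), Record(12, "a", None)]

state, stream_time = materialize(interleave(primary_table, table))
```

With real consumers the merge would have to buffer per partition and decide
how long to wait on an idle input, which is the part I do not see how to
express in Kafka Streams.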

If you have any thoughts on this or some recommendations for Kafka Streams
with Active-Active I would be very appreciative.

Thanks
Adam
