Hi Everyone,

Recently I came across a Kafka setup where two data centers are close to
each other, but the company could not find a suitable location for the
third one. As a result, the third DC is a little further away and has lower
network throughput, but its latency is still low enough to qualify for a
stretch cluster. Let us assume that client applications are deployed only
in the two "primary" DCs. My idea is to minimize network traffic between
DC3 and the other data centers (ideally to replication only).

For Kafka consumers, we can configure rack awareness so that they read
from the closest replica (client.rack on the consumer, together with
replica.selector.class on the brokers).
Kafka producers have to send data to partition leaders. As far as I know,
there is no way to tell Kafka that we prefer partition leaders to run in
DC1 and DC2. Kafka will also try to keep leaders evenly balanced across
brokers (auto.leader.rebalance.enable).
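
For reference, the consumer side could look roughly like the fragment
below. The rack names dc1/dc2/dc3 are only placeholders for this example;
the property names and the selector class are the ones introduced with
follower fetching (KIP-392).

```properties
# Broker side (server.properties), set on every broker.
# "dc1" is a placeholder; use dc2 / dc3 on brokers in the other DCs.
broker.rack=dc1
replica.selector.class=org.apache.kafka.common.replica.RackAwareReplicaSelector

# Consumer side: should match the broker.rack value of the DC the
# consumer runs in, so fetches go to a replica in the same DC.
client.rack=dc1
```

This only affects fetches; produce requests still go to the partition
leader wherever it is.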

Does it sound like a good feature to make the choice of partition leaders
pluggable? Basically, users would be given a list of topic-partitions with
their ISRs and the racks those replicas run in, and could reshuffle
leadership according to custom logic.
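
To make the idea a bit more concrete, here is a rough sketch of what such
custom logic might look like. Everything in it is hypothetical: the class
PreferredRackLeaderSelector, the Replica type and the method names are made
up for illustration and are not part of any Kafka API. It simply prefers an
in-sync replica in DC1/DC2 and falls back to any in-sync replica otherwise.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch only: none of these types exist in Apache Kafka.
final class PreferredRackLeaderSelector {

    /** An in-sync replica: a broker id plus the rack (here: DC) it runs in. */
    static final class Replica {
        final int brokerId;
        final String rack;

        Replica(int brokerId, String rack) {
            this.brokerId = brokerId;
            this.rack = rack;
        }
    }

    private final Set<String> preferredRacks;

    PreferredRackLeaderSelector(Set<String> preferredRacks) {
        this.preferredRacks = preferredRacks;
    }

    /**
     * Returns the broker id of the first in-sync replica located in a
     * preferred rack, or the first in-sync replica if none is.
     */
    int electLeader(List<Replica> isr) {
        if (isr.isEmpty()) {
            throw new IllegalArgumentException("ISR must not be empty");
        }
        for (Replica replica : isr) {
            if (preferredRacks.contains(replica.rack)) {
                return replica.brokerId;
            }
        }
        return isr.get(0).brokerId;
    }

    /** Applies electLeader to every topic-partition, keeping input order. */
    Map<String, Integer> electLeaders(Map<String, List<Replica>> isrByPartition) {
        Map<String, Integer> leaders = new LinkedHashMap<>();
        for (Map.Entry<String, List<Replica>> entry : isrByPartition.entrySet()) {
            leaders.put(entry.getKey(), electLeader(entry.getValue()));
        }
        return leaders;
    }

    public static void main(String[] args) {
        PreferredRackLeaderSelector selector = new PreferredRackLeaderSelector(
                new HashSet<>(Arrays.asList("dc1", "dc2")));

        Map<String, List<Replica>> isrByPartition = new LinkedHashMap<>();
        // Broker 3 (dc3) is listed first, but broker 1 (dc1) is preferred.
        isrByPartition.put("orders-0",
                Arrays.asList(new Replica(3, "dc3"), new Replica(1, "dc1")));
        // Only the dc3 replica is in sync, so it has to become leader.
        isrByPartition.put("orders-1",
                Arrays.asList(new Replica(3, "dc3")));

        System.out.println(selector.electLeaders(isrByPartition));
    }
}
```

The fallback to a DC3 replica is deliberate in this sketch: availability
wins over placement when no preferred replica is in sync.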

Comments appreciated.

Kind regards,
Lukasz
