[
https://issues.apache.org/jira/browse/KAFKA-6718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ashish Surana updated KAFKA-6718:
---------------------------------
Description:
|Machines in a data centre are sometimes grouped into racks. Racks provide
isolation, as each rack may be in a different physical location and have its own
power source. When tasks are properly replicated across racks, the application
gains fault tolerance: if one rack goes down, the remaining racks can continue
to serve traffic.
This feature is already implemented in Kafka
[KIP-36|https://cwiki.apache.org/confluence/display/KAFKA/KIP-36+Rack+aware+replica+assignment]
but we need something similar for task assignment at the Kafka Streams
application layer.
This feature enables replica tasks to be assigned to different racks for
fault tolerance.
NUM_STANDBY_REPLICAS = x
totalTasks = x + 1 (x standby replicas + 1 active task)
# If no rackId is provided: the cluster behaves rack-unaware.
# If the same rackId is given to all nodes: the cluster behaves rack-unaware.
# If (totalTasks <= number of racks), the cluster is rack-aware, i.e.
each replica task is assigned to a different rack.
# If (totalTasks > number of racks), tasks are first assigned to
different racks; any further tasks are assigned to the least loaded node,
cluster-wide.|
We have added another config in StreamsConfig called "RACK_ID_CONFIG", which
helps StickyPartitionAssignor assign tasks in such a way that no two replica
tasks are on the same rack if possible.
Beyond that, it also helps to maintain stickiness within the rack.|
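For illustration only, the four rules above could be sketched roughly as
follows in Python. This is not the actual patch: the function name
assign_replicas, the node-to-rack dictionary, the tie-breaking by node name,
and the choice to treat a partially configured cluster as rack-unaware are all
assumptions made for the example.

```python
def assign_replicas(node_racks, num_standby_replicas, load=None):
    """Choose nodes for one task's active copy plus its standby replicas.

    node_racks: dict of node name -> rack id (None when no rack id is set).
    load: optional dict of node name -> copies already placed (updated in place).
    Returns the chosen node names; the first entry hosts the active copy.
    All names here are hypothetical and exist only for this sketch.
    """
    total_tasks = num_standby_replicas + 1  # standby replicas + active
    if load is None:
        load = {node: 0 for node in node_racks}

    rack_ids = set(node_racks.values())
    # Rules 1 and 2: missing rack ids, or one shared rack id, mean rack-unaware.
    # Assumption: a cluster with only some rack ids set is also rack-unaware.
    rack_aware = None not in rack_ids and len(rack_ids) > 1

    chosen = []
    used_racks = set()
    for _ in range(total_tasks):
        candidates = [n for n in node_racks if n not in chosen]
        if not candidates:
            break  # fewer nodes than copies: place as many as possible
        if rack_aware:
            # Rule 3: prefer nodes on racks that hold no copy of this task yet.
            unused = [n for n in candidates if node_racks[n] not in used_racks]
            if unused:
                candidates = unused
            # Rule 4: otherwise fall through to the least loaded node overall.
        node = min(candidates, key=lambda n: (load.get(n, 0), n))
        chosen.append(node)
        used_racks.add(node_racks[node])
        load[node] = load.get(node, 0) + 1
    return chosen


if __name__ == "__main__":
    nodes = {"a1": "r1", "a2": "r1", "b1": "r2", "c1": "r3"}
    print(assign_replicas(nodes, num_standby_replicas=2))  # ['a1', 'b1', 'c1']
```

With three racks and totalTasks = 3, every copy lands on a different rack.
With only two racks, the third copy falls back to the least loaded remaining
node, which matches the last rule.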
was:
|Machines in a data centre are sometimes grouped into racks. Racks provide
isolation, as each rack may be in a different physical location and have its own
power source. When tasks are properly replicated across racks, the application
gains fault tolerance: if one rack goes down, the remaining racks can continue
to serve traffic.
This feature is already implemented in Kafka
[KIP-36|https://cwiki.apache.org/confluence/display/KAFKA/KIP-36+Rack+aware+replica+assignment]
but we need something similar for task assignment at the Kafka Streams
application layer.
This feature enables replica tasks to be assigned to different racks for
fault tolerance.
NUM_STANDBY_REPLICAS = x
totalTasks = x + 1 (x standby replicas + 1 active task)
# If no rackId is provided: the cluster behaves rack-unaware.
# If the same rackId is given to all nodes: the cluster behaves rack-unaware.
# If (totalTasks >= number of racks), the cluster is rack-aware, i.e.
each replica task is assigned to a different rack.
# If (totalTasks < number of racks), tasks are first assigned to
different racks; any further tasks are assigned to the least loaded node,
cluster-wide.|
We have added another config in StreamsConfig called "RACK_ID_CONFIG", which
helps StickyPartitionAssignor assign tasks in such a way that no two replica
tasks are on the same rack if possible.
Beyond that, it also helps to maintain stickiness within the rack.|
> Rack Aware Replica Task Assignment for Kafka Streams
> ----------------------------------------------------
>
> Key: KAFKA-6718
> URL: https://issues.apache.org/jira/browse/KAFKA-6718
> Project: Kafka
> Issue Type: New Feature
> Components: streams
> Affects Versions: 1.1.0
> Reporter: Deepak Goyal
> Assignee: Deepak Goyal
> Priority: Major
> Labels: needs-kip
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)