[GitHub] [kafka] mjsax commented on a diff in pull request #14181: KAFKA-15022: [10/N] docs for rack aware assignor

2023-08-10 Thread via GitHub


mjsax commented on code in PR #14181:
URL: https://github.com/apache/kafka/pull/14181#discussion_r1290486200


##
docs/streams/developer-guide/config-streams.html:
##
@@ -685,6 +688,45 @@ default.windowed.value.serde.innerThis is discussed in more detail in Data types and serialization.
 
 
+  
+rack.aware.assignment.non_overlap_cost
+
+  
+
+  This configuration sets the cost of moving a task from the 
original assignment computed either by StickyTaskAssignor or
+  HighAvailabilityTaskAssignor. Together with rack.aware.assignment.traffic_cost,
+  they control whether the optimizer favors minimizing cross rack 
traffic or minimizing the movement of tasks in the existing assignment. If this 
config is set to a larger value than rack.aware.assignment.traffic_cost,
+  the optimizer will try to maintain the existing assignment 
computed by the task assignor. Note that the optimizer takes the ratio of these 
two configs into consideration of favoring maintaining existing assignment or 
minimizing traffic cost. For example, setting
+  rack.aware.assignment.non_overlap_cost to 10 and 
rack.aware.assignment.traffic_cost to 1 is more 
likely to maintain existing assignment than setting
+  rack.aware.assignment.non_overlap_cost to 100 and 
rack.aware.assignment.traffic_cost to 50.
+
+
+  The default value is null which means default non_overlap_cost 
in different assignors will be used. In StickyTaskAssignor, it has a higher default value 
than rack.aware.assignment.traffic_cost which means
+  maintaining stickiness is preferred in StickyTaskAssignor. In HighAvailabilityTaskAssignor, it has a lower default 
value than rack.aware.assignment.traffic_cost
+  which means minimizing cross rack traffic is preferred in HighAvailabilityTaskAssignor.
+
+  
+
+  
+  
+rack.aware.assignment.strategy
+
+  
+
+  This configuration sets the strategy Kafka Streams can use for 
rack aware task assignment so that cross traffic from broker to client can be 
reduced. This config will only take effect when broker.rack
+  is set on broker side and client.rack is set on Kafka Streams side. There are 
two settings for this config:

Review Comment:
   > is set on broker side
   
   Either "is set on the brokers" or "is set broker side".
   
   The same applies to `is set on Kafka Streams side`.
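
For readers following the thread, here is a minimal sketch of what the client-side half of this setup might look like. It uses only `java.util.Properties` so it runs without the Kafka jars. The config keys are the ones named in the diff; the application id, bootstrap address, rack name, and the strategy value `min_traffic` are assumptions for illustration and are not taken from the diff, which is cut off before the settings are listed.

```java
import java.util.Properties;

public class RackAwareConfigSketch {

    // Builds client-side properties for a Kafka Streams application.
    // All values are illustrative placeholders, not recommended defaults.
    static Properties rackAwareProps() {
        Properties props = new Properties();
        // Placeholder application id and bootstrap address (assumptions).
        props.put("application.id", "rack-aware-demo");
        props.put("bootstrap.servers", "localhost:9092");
        // Rack of this client. The brokers must also set broker.rack in
        // their own configuration for the strategy to take effect.
        props.put("client.rack", "rack-a");
        // Strategy value is an assumption; check the released docs for the
        // settings that are actually supported.
        props.put("rack.aware.assignment.strategy", "min_traffic");
        // Cost values taken from the 10-to-1 example in the docs text.
        props.put("rack.aware.assignment.traffic_cost", "10");
        props.put("rack.aware.assignment.non_overlap_cost", "1");
        return props;
    }

    public static void main(String[] args) {
        // Print the resulting key/value pairs.
        rackAwareProps().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

In a real application these properties would be passed to the `KafkaStreams` constructor; `broker.rack` is set separately in each broker's own configuration.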



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org





mjsax commented on code in PR #14181:
URL: https://github.com/apache/kafka/pull/14181#discussion_r1290445388


##
docs/streams/architecture.html:
##
@@ -167,6 +167,14 @@ Kafka
 Streams Developer Guide section.
 
+
+There is also a client side config client.rack which can 
set the rack for a Kafka Consumer. If broker side also have rack set via 
broker.rack. Then rack aware task
+assignment can be enabled to compute a task assignment which can 
reduce cross rack traffic by try to assign tasks to clients with the same rack. 
See rack.aware.assignment.strategy in
+Kafka
 Streams Developer Guide.
+Note that client.rack can also be used to distribute 
standby tasks on different "rack" from the active ones, which has a similar 
functionality as rack.aware.assignment.tags.

Review Comment:
   ```suggestion
   Note that client.rack can also be used to distribute standby tasks to different racks from the active ones,
   which has a similar functionality as rack.aware.assignment.tags.
   ```








mjsax commented on code in PR #14181:
URL: https://github.com/apache/kafka/pull/14181#discussion_r1290489338


##
docs/streams/developer-guide/config-streams.html:
##
@@ -718,6 +760,24 @@ rack.aware.assignment.tags
+rack.aware.assignment.traffic_cost
+
+  
+
+  This configuration sets the cost of cross rack traffic. Together 
with rack.aware.assignment.non_overlap_cost,
+  they control whether the optimizer favors minimizing cross rack 
traffic or minimizing the movement of tasks in the existing assignment. If this 
config is set to a larger value than rack.aware.assignment.non_overlap_cost,
+  the optimizer will try to compute an assignment which minimize 
the cross rack traffic. Note that the optimizer takes the ratio of these two 
configs into consideration of favoring maintaining existing assignment or 
minimizing traffic cost. For example, setting
+  rack.aware.assignment.traffic_cost to 10 and rack.aware.assignment.non_overlap_cost to 1 is more 
likely to minimize cross rack traffic than setting
+  rack.aware.assignment.traffic_cost to 100 and rack.aware.assignment.non_overlap_cost to 50.
+
+
+  The default value is null which means default traffic cost in 
different assignors will be used. In StickyTaskAssignor, it has a lower default value than 
rack.aware.assignment.non_overlap_cost.

Review Comment:
   As above: include default value?








mjsax commented on code in PR #14181:
URL: https://github.com/apache/kafka/pull/14181#discussion_r1290445751


##
docs/streams/architecture.html:
##
@@ -167,6 +167,14 @@ Kafka
 Streams Developer Guide section.
 
+
+There is also a client side config client.rack which can 
set the rack for a Kafka Consumer. If broker side also have rack set via 
broker.rack. Then rack aware task
+assignment can be enabled to compute a task assignment which can 
reduce cross rack traffic by try to assign tasks to clients with the same rack. 
See rack.aware.assignment.strategy in
+Kafka
 Streams Developer Guide.
+Note that client.rack can also be used to distribute 
standby tasks on different "rack" from the active ones, which has a similar 
functionality as rack.aware.assignment.tags.
+Currently, rack.aware.assignment.tag takes precedence in 
distributing standby tasks which means if both configs present, 
rack.aware.assignment.tag will be used for distributing
+standby tasks on different "rack" from the active ones because it can 
configure more tag keys.

Review Comment:
   ```suggestion
   standby tasks on different racks from the active ones because it can configure more tag keys.
   ```



##
docs/streams/architecture.html:
##
@@ -167,6 +167,14 @@ Kafka
 Streams Developer Guide section.
 
+
+There is also a client side config client.rack which can 
set the rack for a Kafka Consumer. If broker side also have rack set via 
broker.rack. Then rack aware task
+assignment can be enabled to compute a task assignment which can 
reduce cross rack traffic by try to assign tasks to clients with the same rack. 
See rack.aware.assignment.strategy in
+Kafka
 Streams Developer Guide.

Review Comment:
   Drop this line



##
docs/streams/architecture.html:
##
@@ -167,6 +167,14 @@ Kafka
 Streams Developer Guide section.
 
+
+There is also a client side config client.rack which can 
set the rack for a Kafka Consumer. If broker side also have rack set via 
broker.rack. Then rack aware task
+assignment can be enabled to compute a task assignment which can 
reduce cross rack traffic by try to assign tasks to clients with the same rack. 
See rack.aware.assignment.strategy in

Review Comment:
   ```suggestion
   assignment can be enabled via rack.aware.assignment.strategy (cf. Kafka Streams Developer Guide)
   to compute a task assignment which can reduce cross rack traffic by trying to assign tasks to clients with the same rack.
   ```








mjsax commented on code in PR #14181:
URL: https://github.com/apache/kafka/pull/14181#discussion_r1290483313


##
docs/streams/developer-guide/config-streams.html:
##
@@ -685,6 +688,45 @@ default.windowed.value.serde.innerThis is discussed in more detail in Data types and serialization.
 
 
+  
+rack.aware.assignment.non_overlap_cost
+
+  
+
+  This configuration sets the cost of moving a task from the 
original assignment computed either by StickyTaskAssignor or
+  HighAvailabilityTaskAssignor. Together with rack.aware.assignment.traffic_cost,
+  they control whether the optimizer favors minimizing cross rack 
traffic or minimizing the movement of tasks in the existing assignment. If this 
config is set to a larger value than rack.aware.assignment.traffic_cost,
+  the optimizer will try to maintain the existing assignment 
computed by the task assignor. Note that the optimizer takes the ratio of these 
two configs into consideration of favoring maintaining existing assignment or 
minimizing traffic cost. For example, setting
+  rack.aware.assignment.non_overlap_cost to 10 and 
rack.aware.assignment.traffic_cost to 1 is more 
likely to maintain existing assignment than setting
+  rack.aware.assignment.non_overlap_cost to 100 and 
rack.aware.assignment.traffic_cost to 50.
+
+
+  The default value is null which means default non_overlap_cost 
in different assignors will be used. In StickyTaskAssignor, it has a higher default value 
than rack.aware.assignment.traffic_cost which means

Review Comment:
   > it has a higher default value
   
   Should we include the value?
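
As an aside on the ratio wording in the hunk above, the two example settings can be compared with a small worked sketch. The helper `stickinessRatio` is hypothetical and is not part of Kafka Streams; it only illustrates the arithmetic the docs text describes. It does not reproduce how the optimizer actually weighs the two costs.

```java
public class CostRatioSketch {

    // Hypothetical helper (not a Kafka API): ratio of non_overlap_cost to
    // traffic_cost. A larger ratio leans toward keeping the existing assignment.
    static double stickinessRatio(int nonOverlapCost, int trafficCost) {
        return (double) nonOverlapCost / trafficCost;
    }

    public static void main(String[] args) {
        double first = stickinessRatio(10, 1);    // non_overlap_cost=10, traffic_cost=1
        double second = stickinessRatio(100, 50); // non_overlap_cost=100, traffic_cost=50
        System.out.println("10 vs 1   -> ratio " + first);
        System.out.println("100 vs 50 -> ratio " + second);
        // The first setting has the larger ratio, matching the docs' claim that
        // it is more likely to maintain the existing assignment.
        System.out.println("first favors stickiness more: " + (first > second));
    }
}
```

The absolute values are larger in the second setting, but the ratio is smaller (2 versus 10), which is why the docs describe the first setting as the stickier one.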








mjsax commented on code in PR #14181:
URL: https://github.com/apache/kafka/pull/14181#discussion_r1290489521


##
docs/streams/developer-guide/config-streams.html:
##
@@ -718,6 +760,24 @@ rack.aware.assignment.tags
+rack.aware.assignment.traffic_cost
+
+  
+
+  This configuration sets the cost of cross rack traffic. Together 
with rack.aware.assignment.non_overlap_cost,
+  they control whether the optimizer favors minimizing cross rack 
traffic or minimizing the movement of tasks in the existing assignment. If this 
config is set to a larger value than rack.aware.assignment.non_overlap_cost,
+  the optimizer will try to compute an assignment which minimize 
the cross rack traffic. Note that the optimizer takes the ratio of these two 
configs into consideration of favoring maintaining existing assignment or 
minimizing traffic cost. For example, setting
+  rack.aware.assignment.traffic_cost to 10 and rack.aware.assignment.non_overlap_cost to 1 is more 
likely to minimize cross rack traffic than setting
+  rack.aware.assignment.traffic_cost to 100 and rack.aware.assignment.non_overlap_cost to 50.
+
+
+  The default value is null which means default traffic cost in 
different assignors will be used. In StickyTaskAssignor, it has a lower default value than 
rack.aware.assignment.non_overlap_cost.
+  In HighAvailabilityTaskAssignor, it has a higher default 
value than rack.aware.assignment.non_overlap_cost.

Review Comment:
   Include default value?








mjsax commented on code in PR #14181:
URL: https://github.com/apache/kafka/pull/14181#discussion_r1290485321


##
docs/streams/developer-guide/config-streams.html:
##
@@ -685,6 +688,45 @@ default.windowed.value.serde.innerThis is discussed in more detail in Data types and serialization.
 
 
+  
+rack.aware.assignment.non_overlap_cost
+
+  
+
+  This configuration sets the cost of moving a task from the 
original assignment computed either by StickyTaskAssignor or
+  HighAvailabilityTaskAssignor. Together with rack.aware.assignment.traffic_cost,
+  they control whether the optimizer favors minimizing cross rack 
traffic or minimizing the movement of tasks in the existing assignment. If this 
config is set to a larger value than rack.aware.assignment.traffic_cost,
+  the optimizer will try to maintain the existing assignment 
computed by the task assignor. Note that the optimizer takes the ratio of these 
two configs into consideration of favoring maintaining existing assignment or 
minimizing traffic cost. For example, setting
+  rack.aware.assignment.non_overlap_cost to 10 and 
rack.aware.assignment.traffic_cost to 1 is more 
likely to maintain existing assignment than setting
+  rack.aware.assignment.non_overlap_cost to 100 and 
rack.aware.assignment.traffic_cost to 50.
+
+
+  The default value is null which means default non_overlap_cost 
in different assignors will be used. In StickyTaskAssignor, it has a higher default value 
than rack.aware.assignment.traffic_cost which means
+  maintaining stickiness is preferred in StickyTaskAssignor. In HighAvailabilityTaskAssignor, it has a lower default 
value than rack.aware.assignment.traffic_cost
+  which means minimizing cross rack traffic is preferred in HighAvailabilityTaskAssignor.
+
+  
+
+  
+  
+rack.aware.assignment.strategy
+
+  
+
+  This configuration sets the strategy Kafka Streams can use for 
rack aware task assignment so that cross traffic from broker to client can be 
reduced. This config will only take effect when broker.rack

Review Comment:
   ```suggestion
   This configuration sets the strategy Kafka Streams uses for rack aware task assignment so that cross traffic
   from broker to client can be reduced. This config will only take effect when broker.rack
   ```








mjsax commented on code in PR #14181:
URL: https://github.com/apache/kafka/pull/14181#discussion_r1290482358


##
docs/streams/developer-guide/config-streams.html:
##
@@ -685,6 +688,45 @@ default.windowed.value.serde.innerThis is discussed in more detail in Data types and serialization.
 
 
+  
+rack.aware.assignment.non_overlap_cost
+
+  
+
+  This configuration sets the cost of moving a task from the 
original assignment computed either by StickyTaskAssignor or
+  HighAvailabilityTaskAssignor. Together with rack.aware.assignment.traffic_cost,
+  they control whether the optimizer favors minimizing cross rack 
traffic or minimizing the movement of tasks in the existing assignment. If this 
config is set to a larger value than rack.aware.assignment.traffic_cost,
+  the optimizer will try to maintain the existing assignment 
computed by the task assignor. Note that the optimizer takes the ratio of these 
two configs into consideration of favoring maintaining existing assignment or 
minimizing traffic cost. For example, setting
+  rack.aware.assignment.non_overlap_cost to 10 and 
rack.aware.assignment.traffic_cost to 1 is more 
likely to maintain existing assignment than setting
+  rack.aware.assignment.non_overlap_cost to 100 and 
rack.aware.assignment.traffic_cost to 50.
+
+
+  The default value is null which means default non_overlap_cost 
in different assignors will be used. In StickyTaskAssignor, it has a higher default value 
than rack.aware.assignment.traffic_cost which means

Review Comment:
   Put `non_overlap_cost` into code markup.



##
docs/streams/architecture.html:
##
@@ -167,6 +167,14 @@ Kafka
 Streams Developer Guide section.
 
+
+There is also a client side config client.rack which can 
set the rack for a Kafka Consumer. If broker side also have rack set via 
broker.rack. Then rack aware task

Review Comment:
   ```suggestion
   There is also a client config client.rack which can set the rack for a Kafka consumer.
   If brokers also have their rack set via broker.rack, then rack aware task
   ```








mjsax commented on code in PR #14181:
URL: https://github.com/apache/kafka/pull/14181#discussion_r1290483692


##
docs/streams/developer-guide/config-streams.html:
##
@@ -685,6 +688,45 @@ default.windowed.value.serde.innerThis is discussed in more detail in Data types and serialization.
 
 
+  
+rack.aware.assignment.non_overlap_cost
+
+  
+
+  This configuration sets the cost of moving a task from the 
original assignment computed either by StickyTaskAssignor or
+  HighAvailabilityTaskAssignor. Together with rack.aware.assignment.traffic_cost,
+  they control whether the optimizer favors minimizing cross rack 
traffic or minimizing the movement of tasks in the existing assignment. If this 
config is set to a larger value than rack.aware.assignment.traffic_cost,
+  the optimizer will try to maintain the existing assignment 
computed by the task assignor. Note that the optimizer takes the ratio of these 
two configs into consideration of favoring maintaining existing assignment or 
minimizing traffic cost. For example, setting
+  rack.aware.assignment.non_overlap_cost to 10 and 
rack.aware.assignment.traffic_cost to 1 is more 
likely to maintain existing assignment than setting
+  rack.aware.assignment.non_overlap_cost to 100 and 
rack.aware.assignment.traffic_cost to 50.
+
+
+  The default value is null which means default non_overlap_cost 
in different assignors will be used. In StickyTaskAssignor, it has a higher default value 
than rack.aware.assignment.traffic_cost which means
+  maintaining stickiness is preferred in StickyTaskAssignor. In HighAvailabilityTaskAssignor, it has a lower default 
value than rack.aware.assignment.traffic_cost

Review Comment:
   > it has a lower default value
   
   As above.


