317brian commented on code in PR #13993:
URL: https://github.com/apache/druid/pull/13993#discussion_r1152522377


##########
docs/configuration/index.md:
##########
@@ -1223,52 +1223,80 @@ A sample worker config spec is shown below:
 }
 ```
 
-Issuing a GET request at the same URL will return the current worker config 
spec that is currently in place. The worker config spec list above is just a 
sample for EC2 and it is possible to extend the code base for other deployment 
environments. A description of the worker config spec is shown below.
+Issuing a GET request at the same URL will return the current Overlord dynamic 
config spec.
 
-|Property|Description|Default|
-|--------|-----------|-------|
-|`selectStrategy`|How to assign tasks to MiddleManagers. Choices are 
`fillCapacity`, `equalDistribution`, and `javascript`.|equalDistribution|
-|`autoScaler`|Only used if autoscaling is enabled. See below.|null|
+|Property| Description | Default |
+|--------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------|
+|`selectStrategy`| Desctibes how to assign tasks to MiddleManagers. The type 
can be `equalDistribution`, `equalDistributionWithCategorySpec`, 
`fillCapacity`, `fillCapacityWithCategorySpec`, and `javascript`. | 
`{"type":"equalDistribution"} |
+|`autoScaler`| Only used if autoscaling is enabled. See below. | null |
 
 To view the audit history of worker config issue a GET request to the URL -
 
 ```
 http://<OVERLORD_IP>:<port>/druid/indexer/v1/worker/history?interval=<interval>
 ```
 
-default value of interval can be specified by setting 
`druid.audit.manager.auditHistoryMillis` (1 week if not configured) in Overlord 
runtime.properties.
+The default value of `interval` can be specified by setting 
`druid.audit.manager.auditHistoryMillis` (1 week if not configured) in Overlord 
runtime.properties.
 
 To view last `n` entries of the audit history of worker config issue a GET 
request to the URL -
 
 ```
 http://<OVERLORD_IP>:<port>/druid/indexer/v1/worker/history?count=<n>
 ```
 
-##### Worker Select Strategy
+##### Worker select strategy
+
+The select strategy controls how Druid assigns tasks to workers 
(MiddleManagers).
+At a high level a select strategy determines the list of possible workers that 
the task can be assigned to using
+either an `affinityConfig` or a `categorySpec` and then it assigns the task by 
either trying to distribute load equally
+(`equalDistribution`) or to fill as many workers as possible to capacity 
(`fillCapacity`).
+This forms 4 possible options for supported select strategies.
+A `javascript` option is also available which should only be used for 
prototyping new strategies.
+
+If an `affinityConfig` is provided (as part of `fillCapacity` and 
`equalDistribution` strategies) for a given task the list of workers eligible 
to be assigned is determined as follows:
+
+- a non-affinity worker, if no affinity is specified for that datasource (any 
worker not listed in the `affinityConfig` is known as a "Non-affinity worker")
+- a non-affinity worker, if preferred workers are not available and affinity 
is `weak`
+- a preferred worker (based on affinityConfig), if available
+- not assigned at all (i.e. remains pending), if preferred MMs are not 
available and affinity is `strong`
+
+Note that every worker listed in the `affinityConfig` will only be used for 
the assigned datasources and no other.
+
+If a `categorySpec` is provided (as part of `fillCapacityWithCategorySpec` and 
`equalDistributionWithCategorySpec` strategies) for a given task the list of 
workers eligible to be assigned is determined as follows:
 
-Worker select strategies control how Druid assigns tasks to MiddleManagers.
+- any worker, if no categoryConfig given for task type
+- any worker, if categoryConfig given for task type but no category given for 
datasource and no default category either
+- a preferred worker (based on categoryConfig + category for datasource), if 
available
+- any worker, if categoryConfig given and category given but no preferred 
worker is available and categoryConfig is `weak`
+- not assigned at all, if preferred workers are not available and 
`categoryConfig` is `strong`
 
-###### Equal Distribution
+In either case, after the eligible worker list is constructed one will be 
selected depending on their load with the goal of either distributing the load 
equally or filling as few workers as possible.
 
-Tasks are assigned to the MiddleManager with the most free slots at the time 
the task begins running. This is useful if
-you want work evenly distributed across your MiddleManagers.
+If you are using auto-scaling it only makes sense to use the `fillCapacity` 
select strategy since auto-scaled nodes can
+not be assigned a category, and you want the work to be concentrated on the 
fewest number of workers to allow the empty ones to scale down.
+
+###### `equalDistribution`
+
+Tasks are assigned to the MiddleManager with the most free slots at the time 
the task begins running.
+This will evenly distribute work across your MiddleManagers.
 
 |Property|Description|Default|
 |--------|-----------|-------|
-|`type`|`equalDistribution`.|required; must be `equalDistribution`|
+|`type`|`equalDistribution`|required; must be `equalDistribution`|
 |`affinityConfig`|[Affinity config](#affinity) object|null (no affinity)|

Review Comment:
   ```suggestion
   |`affinityConfig`|[Affinity config](#affinityconfig) object|null (no affinity)|
   ```
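
   A rough sketch of why the anchor would change, assuming the target heading (not shown in this hunk) is now `affinityConfig` and that the docs site derives anchors GitHub-style (lowercase, punctuation dropped, spaces turned into hyphens). The helper name `heading_to_anchor` is hypothetical and only approximates the real slug rules:

```python
import re


def heading_to_anchor(heading: str) -> str:
    """Approximate a GitHub-style heading anchor (assumed rules, not the site's exact code)."""
    text = heading.strip().lower()
    # Drop backticks and other punctuation; keep word characters, hyphens, and spaces.
    text = re.sub(r"[^\w\- ]", "", text)
    # Each remaining space becomes a hyphen.
    return text.replace(" ", "-")


if __name__ == "__main__":
    # Assumed heading text after the rename in this PR.
    print(heading_to_anchor("`affinityConfig`"))
    # Heading taken from the diff above.
    print(heading_to_anchor("Worker select strategy"))
```

   Under those assumptions the link target is `#affinityconfig`, so a link to `#affinity` would no longer resolve.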



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

