317brian commented on code in PR #13993:
URL: https://github.com/apache/druid/pull/13993#discussion_r1152522377
##########
docs/configuration/index.md:
##########
@@ -1223,52 +1223,80 @@ A sample worker config spec is shown below:
}
```
-Issuing a GET request at the same URL will return the current worker config
spec that is currently in place. The worker config spec list above is just a
sample for EC2 and it is possible to extend the code base for other deployment
environments. A description of the worker config spec is shown below.
+Issuing a GET request at the same URL will return the current Overlord dynamic
config spec.
-|Property|Description|Default|
-|--------|-----------|-------|
-|`selectStrategy`|How to assign tasks to MiddleManagers. Choices are
`fillCapacity`, `equalDistribution`, and `javascript`.|equalDistribution|
-|`autoScaler`|Only used if autoscaling is enabled. See below.|null|
+|Property| Description
| Default |
+|--------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------|
+|`selectStrategy`| Describes how to assign tasks to MiddleManagers. The type
can be `equalDistribution`, `equalDistributionWithCategorySpec`,
`fillCapacity`, `fillCapacityWithCategorySpec`, and `javascript`. |
`{"type":"equalDistribution"}` |
+|`autoScaler`| Only used if autoscaling is enabled. See below.
| null |
To view the audit history of worker config issue a GET request to the URL -
```
http://<OVERLORD_IP>:<port>/druid/indexer/v1/worker/history?interval=<interval>
```
-default value of interval can be specified by setting
`druid.audit.manager.auditHistoryMillis` (1 week if not configured) in Overlord
runtime.properties.
+The default value of `interval` can be specified by setting
`druid.audit.manager.auditHistoryMillis` (1 week if not configured) in Overlord
runtime.properties.
To view last `n` entries of the audit history of worker config issue a GET
request to the URL -
```
http://<OVERLORD_IP>:<port>/druid/indexer/v1/worker/history?count=<n>
```
-##### Worker Select Strategy
+##### Worker select strategy
+
+The select strategy controls how Druid assigns tasks to workers
(MiddleManagers).
+At a high level a select strategy determines the list of possible workers that
the task can be assigned to using
+either an `affinityConfig` or a `categorySpec` and then it assigns the task by
either trying to distribute load equally
+(`equalDistribution`) or to fill as many workers as possible to capacity
(`fillCapacity`).
+This forms 4 possible options for supported select strategies.
+A `javascript` option is also available which should only be used for
prototyping new strategies.
+
+If an `affinityConfig` is provided (as part of `fillCapacity` and
`equalDistribution` strategies) for a given task the list of workers eligible
to be assigned is determined as follows:
+
+- a non-affinity worker, if no affinity is specified for that datasource (any
worker not listed in the `affinityConfig` is known as a "Non-affinity worker")
+- a non-affinity worker, if preferred workers are not available and affinity
is `weak`
+- a preferred worker (based on affinityConfig), if available
+- not assigned at all (i.e. remains pending), if preferred MMs are not
available and affinity is `strong`
+
+Note that every worker listed in the `affinityConfig` will only be used for
the assigned datasources and no other.
+
+If a `categorySpec` is provided (as part of `fillCapacityWithCategorySpec` and
`equalDistributionWithCategorySpec` strategies) for a given task the list of
workers eligible to be assigned is determined as follows:
-Worker select strategies control how Druid assigns tasks to MiddleManagers.
+- any worker, if no categoryConfig given for task type
+- any worker, if categoryConfig given for task type but no category given for
datasource and no default category either
+- a preferred worker (based on categoryConfig + category for datasource), if
available
+- any worker, if categoryConfig given and category given but no preferred
worker is available and categoryConfig is `weak`
+- not assigned at all, if preferred workers are not available and
`categoryConfig` is `strong`
-###### Equal Distribution
+In either case, after the eligible worker list is constructed one will be
selected depending on their load with the goal of either distributing the load
equally or filling as few workers as possible.
-Tasks are assigned to the MiddleManager with the most free slots at the time
the task begins running. This is useful if
-you want work evenly distributed across your MiddleManagers.
+If you are using auto-scaling it only makes sense to use the `fillCapacity`
select strategy since auto-scaled nodes can
+not be assigned a category, and you want the work to be concentrated on the
fewest number of workers to allow the empty ones to scale down.
+
+###### `equalDistribution`
+
+Tasks are assigned to the MiddleManager with the most free slots at the time
the task begins running.
+This will evenly distribute work across your MiddleManagers.
|Property|Description|Default|
|--------|-----------|-------|
-|`type`|`equalDistribution`.|required; must be `equalDistribution`|
+|`type`|`equalDistribution`|required; must be `equalDistribution`|
|`affinityConfig`|[Affinity config](#affinity) object|null (no affinity)|
Review Comment:
```suggestion
|`affinityConfig`|[Affinity config](#affinityconfig) object|null (no
affinity)|
```
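
For readers following the affinity rules in the hunk above, here is a rough sketch of how the eligible-worker list could be derived. This is only an illustration of the four bullet points in the new text, not Druid's actual implementation; the function name `eligible_workers`, its parameters, and the host names are all hypothetical.

```python
def eligible_workers(datasource, available, affinity, strong=False):
    """Return the workers a task for `datasource` may be assigned to.

    Hypothetical helper, not part of Druid:
      available -- set of worker hosts that currently have free capacity
      affinity  -- dict mapping datasource name -> list of preferred hosts
      strong    -- True for strong affinity, False for weak affinity
    """
    # Any worker not listed anywhere in the affinity map is a non-affinity worker.
    affinity_hosts = {host for hosts in affinity.values() for host in hosts}
    non_affinity = sorted(w for w in available if w not in affinity_hosts)

    preferred = affinity.get(datasource)
    if not preferred:
        # No affinity specified for this datasource: non-affinity workers only.
        return non_affinity

    preferred_available = sorted(w for w in preferred if w in available)
    if preferred_available:
        # Preferred workers are used whenever they are available.
        return preferred_available

    # Preferred workers unavailable: strong affinity leaves the task pending,
    # weak affinity falls back to non-affinity workers.
    return [] if strong else non_affinity
```

With an assumed map `{"wiki": ["mm1"]}`, a `wiki` task goes to `mm1` while it is available, and a task for any other datasource only ever sees workers outside the map, which matches the note that affinity workers serve their assigned datasources and no other.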
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]