paul-rogers opened a new pull request #12222: URL: https://github.com/apache/druid/pull/12222
### Description

Druid provides extensive extension support. Extensions can define new services, but those services cannot yet be discovered. This PR ensures that they operate like native services. See [this ticket](https://github.com/apache/druid/issues/12218) for details.

The current code has a static list of node roles. This PR moves the list into a Guice multi-binder so that extensions can add to the list.

Next, the SQL system servers table code is revised to use the Guice-provided list of node roles rather than the hard-coded list.

Then, the `/druid/coordinator/v1/cluster` endpoint is extended to include extension roles after the Druid-defined roles. The existing code imposed a specific order on the roles; that order is preserved.

Finally, a new endpoint `/druid/router/v1/cluster` is added. The logic here is that clients start with the router endpoint. To get the list of services, they must first find the coordinator, but finding the coordinator itself requires the list of services. To avoid this Catch-22, the new endpoint provides the list of services from the router itself.

Of course, the SQL servers system table also provides the list of services. The Catch-22 in this case is that if the cluster is broken, SQL is unavailable to help figure out which services are down. Having the native endpoint provides a reliable fallback: as long as the Router and ZK are up, we can learn about other services and see what is missing.

The PR includes some refactoring to make bits of role-related code usable in multiple contexts. Most of the code is not unit testable (we can't run servers in unit tests), but where it is testable, tests are added or modified.

<hr>

This PR has:

- [X] been self-reviewed.
- [X] added documentation for new or modified features or behaviors.
- [X] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- [X] added comments explaining the "why" and the intent of the code wherever it would not be obvious for an unfamiliar reader.
- [X] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
- [X] been tested in a test Druid cluster, in the context of the actual extension service which this PR supports.
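
For readers who want a concrete picture of the role ordering described above (Druid-defined roles first, in their fixed order, followed by extension roles), here is a minimal plain-Java sketch. It is not the code in this PR. The class name `NodeRoleOrdering`, the method `orderedRoles`, the use of plain strings for roles, the sample role names, and the choice to sort extension roles alphabetically are all assumptions made for illustration. The `allRoles` set merely stands in for whatever set of roles a Guice multi-binder would supply.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class NodeRoleOrdering
{
  // Illustrative built-in roles in a fixed order (assumed; not necessarily Druid's actual order).
  static final List<String> BUILT_IN_ROLES =
      List.of("coordinator", "overlord", "broker", "historical", "router");

  /**
   * Returns the known roles with built-in roles first, in their fixed order,
   * followed by any remaining (extension) roles sorted by name.
   */
  public static List<String> orderedRoles(List<String> builtIn, Set<String> allRoles)
  {
    List<String> result = new ArrayList<>();
    // Keep the imposed order for built-in roles that are actually present.
    for (String role : builtIn) {
      if (allRoles.contains(role)) {
        result.add(role);
      }
    }
    // Everything else is treated as an extension role; sorting is an assumption.
    Set<String> extras = new TreeSet<>(allRoles);
    extras.removeAll(builtIn);
    result.addAll(extras);
    return result;
  }

  public static void main(String[] args)
  {
    // Hypothetical set of roles, as if contributed by core modules and extensions.
    Set<String> allRoles =
        Set.of("router", "coordinator", "broker", "my-extension-service", "audit");
    // prints [coordinator, broker, router, audit, my-extension-service]
    System.out.println(orderedRoles(BUILT_IN_ROLES, allRoles));
  }
}
```

Keeping the built-in order as an explicit list and appending everything else means an extension never has to know about the ordering rules; it only has to contribute its role to the set.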
