paul-rogers opened a new pull request #12222:
URL: https://github.com/apache/druid/pull/12222


   ### Description
   
   Druid provides extensive extension support. Extensions can define new 
services, but those services cannot yet be discovered. This PR ensures that 
they operate like native services. See [this 
ticket](https://github.com/apache/druid/issues/12218) for details.
   
   The current code has a static list of node roles. This PR moves the list 
into a Guice multi-binder so that extensions can add to the list.
   
   Next, the SQL system servers table code is revised to use the Guice-provided 
list of node roles rather than the hard-coded list.
   
   Then, the `/druid/coordinator/v1/cluster` endpoint is revised to include 
extension roles after the Druid-defined roles. The existing code imposed a 
specific order on the roles; that order is preserved.
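   
   The ordering rule for the coordinator endpoint (Druid-defined roles first, in 
their fixed order, then extension roles) might be sketched as below. This is a 
minimal illustration, not the code in this PR: the class `RoleOrdering`, the 
method `orderRoles`, the contents of `BUILT_IN_ORDER`, and the choice to sort 
extension roles alphabetically are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class RoleOrdering {
    // Hypothetical fixed order for Druid-defined roles.
    // The real list and its order are defined in Druid's own code.
    static final List<String> BUILT_IN_ORDER = List.of(
        "coordinator", "overlord", "broker", "historical",
        "middleManager", "indexer", "router");

    // Returns the registered roles with built-in roles first (in the fixed
    // order above), followed by any extension-defined roles.
    static List<String> orderRoles(Set<String> registered) {
        List<String> ordered = new ArrayList<>();
        for (String role : BUILT_IN_ORDER) {
            if (registered.contains(role)) {
                ordered.add(role);
            }
        }
        // Assumption: extension roles are sorted by name so the output is stable.
        Set<String> extensions = new TreeSet<>(registered);
        extensions.removeAll(BUILT_IN_ORDER);
        ordered.addAll(extensions);
        return ordered;
    }

    public static void main(String[] args) {
        Set<String> registered =
            Set.of("broker", "my-extension-service", "coordinator", "router");
        // prints [coordinator, broker, router, my-extension-service]
        System.out.println(orderRoles(registered));
    }
}
```

   Keeping the built-in order in one list and appending everything else means 
existing consumers see the Druid-defined roles in the same positions as before.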
   
   Finally, a new endpoint `/druid/router/v1/cluster` is added. The logic here 
is that clients start with the router endpoint. To get the list of services 
from the coordinator, a client must first locate the coordinator, which itself 
requires the list of services. To avoid this Catch-22, the new endpoint 
provides the list of services from the router itself.
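   
   A client that knows only the router address might call the new endpoint 
roughly as follows. This is a sketch under stated assumptions: the class 
`RouterClusterClient`, its method names, and the default address 
`http://localhost:8888` are illustrative, and because the response format is 
not described here, the body is printed as-is rather than parsed.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class RouterClusterClient {
    // Endpoint path added by this PR.
    static final String CLUSTER_PATH = "/druid/router/v1/cluster";

    // Builds the endpoint URI from a router base URL, tolerating a trailing slash.
    static URI clusterUri(String routerBaseUrl) {
        String base = routerBaseUrl.endsWith("/")
            ? routerBaseUrl.substring(0, routerBaseUrl.length() - 1)
            : routerBaseUrl;
        return URI.create(base + CLUSTER_PATH);
    }

    // Fetches the raw response body; no assumptions are made about its structure.
    static String fetchCluster(String routerBaseUrl) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(clusterUri(routerBaseUrl)).GET().build();
        HttpResponse<String> response =
            client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IllegalStateException("Unexpected status: " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // Assumed router address; pass the real one as the first argument.
        String router = args.length > 0 ? args[0] : "http://localhost:8888";
        System.out.println(fetchCluster(router));
    }
}
```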
   
   Of course, the SQL servers system table also provides the list of services. 
The Catch-22 in this case is that if the cluster is broken, SQL is unavailable 
to help figure out which services are down. Having the native endpoint provides 
a reliable fallback: as long as the Router and ZK are up, we can learn about 
other services and see what is missing.
   
   The PR includes some refactoring to make bits of role-related code usable in 
multiple contexts. Most of the code is not unit testable (we can't run servers 
in unit tests), but where it is testable, tests are added or modified.
   
   <hr>
   
   This PR has:
   - [X] been self-reviewed.
   - [X] added documentation for new or modified features or behaviors.
   - [X] added Javadocs for most classes and all non-trivial methods. Linked 
related entities via Javadoc links.
   - [X] added comments explaining the "why" and the intent of the code 
wherever would not be obvious for an unfamiliar reader.
   - [X] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for [code 
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
 is met.
   - [X] been tested in a test Druid cluster, in the context of the actual 
extension service which this PR supports.
   




