findingrish opened a new pull request, #14985: URL: https://github.com/apache/druid/pull/14985
## Description

Original proposal: https://github.com/abhishekagarwal87/druid/blob/metadata_design_proposal/design-proposal.md#3-proposed-solution-storing-segment-schema-in-metadata-store

In the current design, Brokers query both data nodes and tasks to fetch the schema of the segments they serve. The table schema is then constructed by combining the schemas of all segments within a dataSource. However, this approach leads to a high number of segment metadata queries during Broker startup, resulting in slow startup times and the various issues outlined in the design proposal.

To address these challenges, we propose centralizing table schema management in the Coordinator. This change is the first step in that direction. In the new arrangement, the Coordinator takes on the responsibility of querying both data nodes and tasks to fetch segment schemas and then building the table schema. Brokers simply query the Coordinator to fetch the table schema. Importantly, Brokers retain the capability to build table schemas themselves if the need arises, ensuring both flexibility and resilience.

## Design

### Coordinator changes

These changes enhance the Coordinator's capabilities, enabling it to manage table schemas and serve schema-related information through its APIs.

**SegmentMetadataCache refactor**: We introduce a new class, `CoordinatorSegmentMetadataCache`, extending `AbstractSegmentMetadataCache`, which manages the cache and forms the core of the schema management process.

**Changes to Coordinator timeline**: A new interface, `CoordinatorTimeline`, is introduced, and the existing Coordinator timeline, `CoordinatorServerView`, now implements it. Additionally, we add a new implementation, `QueryableCoordinatorServerView`, which extends `BrokerServerView`. This facilitates running segment metadata queries on the Coordinator using `CachingClusteredClient`.
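As a rough illustration, the refactored type relationships described above might look like the following. This is a minimal sketch with placeholder bodies: the names mirror the PR, but none of this is Druid's actual code.

```java
// Hypothetical, simplified stand-ins for the classes named in this PR.
// Bodies are placeholders; only the inheritance structure is illustrated.

// Shared segment-metadata cache logic lives in an abstract base class.
abstract class AbstractSegmentMetadataCache {
  abstract String describeRole();
}

// New Coordinator-side implementation that drives table schema building.
class CoordinatorSegmentMetadataCache extends AbstractSegmentMetadataCache {
  @Override
  String describeRole() {
    return "builds table schema on the Coordinator";
  }
}

// New timeline abstraction; the existing CoordinatorServerView implements it.
interface CoordinatorTimeline {}

class BrokerServerView {}

class CoordinatorServerView implements CoordinatorTimeline {}

// Extends BrokerServerView so segment metadata queries can be run on the
// Coordinator (in the real PR, via CachingClusteredClient).
class QueryableCoordinatorServerView extends BrokerServerView
    implements CoordinatorTimeline {}

public class SchemaRefactorSketch {
  public static void main(String[] args) {
    AbstractSegmentMetadataCache cache = new CoordinatorSegmentMetadataCache();
    System.out.println(cache.describeRole());
    System.out.println(new QueryableCoordinatorServerView() instanceof CoordinatorTimeline);
  }
}
```

The point of the shape is that the Coordinator-side view participates in both hierarchies: it remains a `CoordinatorTimeline` for timeline consumers while inheriting the Broker's query-capable machinery.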
**API changes**:
- We introduce a new container class, `DataSourceInformation`, for handling `RowSignature` information. An API is exposed to retrieve dataSource information.
- The existing API responsible for returning used segments is modified to accept a new query parameter, `includeRealtimeSegments`. When set, this parameter includes realtime segments in the result. Additionally, we enhance the returned object, `SegmentStatusInCluster`, to provide additional `realtime` and `numRows` information.

**Binding for query modules**: We add the essential bindings for the required query modules in `CliCoordinator`.

### Broker changes

The Broker now defaults to querying the Coordinator for the table schema, eliminating the need to query data nodes and tasks for segment schemas. There are also changes in the logic for building the sys segments table.

**Broker-side SegmentMetadataCache**: We introduce a new implementation of `AbstractSegmentMetadataCache` called `BrokerSegmentMetadataCache`. This implementation queries the Coordinator for the table schema and falls back to running a segment metadata query when needed. It also assumes responsibility for building `PhysicalDataSourceMetadata`, for which a new class is introduced.

**System schema changes**: In `MetadataSegmentView`, realtime segments are now requested when polling the Coordinator. The segment table building logic is updated accordingly.

### Upgrade considerations

The general upgrade order should be followed. The new code is behind a feature flag, so it is compatible with existing setups. Brokers with the new changes can communicate with old Coordinators without issues.

### Release note

This feature is experimental and addresses multiple challenges outlined in the description. To enable it, set `druid.coordinator.segmentMetadataCache.enabled` in the Coordinator configuration.

<hr>

<!-- Check the items by putting "x" in the brackets for the done things.
Not all of these items apply to every PR. Remove the items which are not done or not relevant to the PR. None of the items from the checklist below are strictly necessary, but it would be very helpful if you at least self-review the PR. -->

This PR has:

- [ ] been self-reviewed.
   - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
- [ ] added documentation for new or modified features or behaviors.
- [ ] a release note entry in the PR description.
- [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
- [ ] added comments explaining the "why" and the intent of the code wherever it would not be obvious for an unfamiliar reader.
- [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
- [ ] added integration tests.
- [ ] been tested in a test Druid cluster.

---
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
