xiangfu0 opened a new pull request, #18480:
URL: https://github.com/apache/pinot/pull/18480
## Summary
- Reuse existing `tierOverwrites` support for mutable realtime consuming
segments by applying `tierOverwrites.consuming` when building
`RealtimeSegmentConfig`.
- Keep committed/immutable segment generation and real storage-tier loading
on the persisted table config or the actual segment tier.
- Treat `consuming` as a synthetic mutable-consuming tier only when the
table does not already define a real storage tier with that name, preserving
existing storage-tier behavior.
- Validate the effective consuming view and reject unsupported
`tableIndexConfig.tierOverwrites.consuming` keys that do not flow through
mutable index loading.
## User Manual
Configure the committed segment shape as usual, then add
`tierOverwrites.consuming` where the mutable consuming segment should differ.
Example: keep `userId` RAW with no inverted index after commit, but use
dictionary encoding plus an inverted index while the segment is consuming:
```json
{
"tableIndexConfig": {
"noDictionaryColumns": ["userId"],
"tierOverwrites": {
"consuming": {
"noDictionaryColumns": []
}
}
},
"fieldConfigList": [
{
"name": "userId",
"encodingType": "RAW",
"tierOverwrites": {
"consuming": {
"encodingType": "DICTIONARY",
"indexes": {
"inverted": {
"enabled": true
}
}
}
}
}
]
}
```
Query example, filtering on `userId`:
```sql
SELECT COUNT(*)
FROM userEvents
WHERE userId = 'u123';
```
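To make the effect of the example config concrete, here is a minimal Python
sketch of how an effective consuming view could be derived from it. It assumes
a shallow, key-by-key overlay of `tierOverwrites.consuming` onto the persisted
config; that merge rule and the helper names `apply_tier_overwrite` and
`consuming_view` are illustrative assumptions, not Pinot's actual
implementation or API.

```python
import copy

def apply_tier_overwrite(config: dict, tier: str) -> dict:
    """Return a copy of `config` with `tierOverwrites[tier]` overlaid on it.

    Assumption: each top-level key in the override replaces the persisted
    value wholesale (no deep merge).
    """
    effective = copy.deepcopy(config)
    overwrites = effective.pop("tierOverwrites", None) or {}
    effective.update(overwrites.get(tier, {}))
    return effective

def consuming_view(table_config: dict) -> dict:
    """Build the hypothetical effective config for a mutable consuming segment."""
    view = copy.deepcopy(table_config)
    view["tableIndexConfig"] = apply_tier_overwrite(
        table_config.get("tableIndexConfig", {}), "consuming")
    view["fieldConfigList"] = [
        apply_tier_overwrite(field_config, "consuming")
        for field_config in table_config.get("fieldConfigList", [])
    ]
    return view

# The config from the example above.
table_config = {
    "tableIndexConfig": {
        "noDictionaryColumns": ["userId"],
        "tierOverwrites": {"consuming": {"noDictionaryColumns": []}},
    },
    "fieldConfigList": [
        {
            "name": "userId",
            "encodingType": "RAW",
            "tierOverwrites": {
                "consuming": {
                    "encodingType": "DICTIONARY",
                    "indexes": {"inverted": {"enabled": True}},
                }
            },
        }
    ],
}

view = consuming_view(table_config)
print(view["tableIndexConfig"]["noDictionaryColumns"])  # []
print(view["fieldConfigList"][0]["encodingType"])       # DICTIONARY
print(table_config["fieldConfigList"][0]["encodingType"])  # RAW (persisted config untouched)
```

The persisted config is copied rather than mutated, which mirrors the intent
that committed segments keep using the table config as written.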
Notes:
- You normally do not need a `tierConfigs` entry named `consuming`.
- If a table already uses `consuming` as a real storage tier name, Pinot
keeps existing storage-tier semantics and does not treat
`tierOverwrites.consuming` as the synthetic mutable-consuming override for that
table.
- If the persisted column is listed in
`tableIndexConfig.noDictionaryColumns` or `noDictionaryConfig`, clear that
setting under `tableIndexConfig.tierOverwrites.consuming` so the consuming view
can enable dictionary encoding for it.
- `tableIndexConfig.tierOverwrites.consuming` is limited to index-loading
settings such as dictionary, inverted, range, JSON, Bloom filter, and
dictionary optimization options. Row-shape or ingestion settings such as
`aggregateMetrics` and `segmentPartitionConfig` are rejected for the synthetic
consuming tier.
- Real storage-tier overrides are unchanged and still apply through actual
immutable segment tiers.
A complete sample config and walkthrough are included under
`pinot-tools/src/main/resources/examples/stream/consumingSegmentTierOverride/`.
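Two of the notes above (the real-tier name check and the rejection of
row-shape keys) can be sketched in Python as follows. This is an illustration
under stated assumptions, not the validation code in this PR: the function
names are hypothetical, and `REJECTED_EXAMPLES` contains only the two keys the
notes name explicitly, so the real rejected set may be larger.

```python
SYNTHETIC_TIER = "consuming"

# Only the two keys named in the notes; treat this set as incomplete.
REJECTED_EXAMPLES = {"aggregateMetrics", "segmentPartitionConfig"}

def is_synthetic_consuming_tier(table_config: dict) -> bool:
    """True when no real storage tier in `tierConfigs` is named `consuming`."""
    tier_names = {tier.get("name") for tier in table_config.get("tierConfigs") or []}
    return SYNTHETIC_TIER not in tier_names

def rejected_consuming_keys(table_config: dict) -> list:
    """List override keys this sketch would reject for the synthetic tier.

    Returns an empty list when `consuming` is a real storage tier, because
    the synthetic-tier check does not apply to that table.
    """
    if not is_synthetic_consuming_tier(table_config):
        return []
    index_config = table_config.get("tableIndexConfig") or {}
    overwrite = (index_config.get("tierOverwrites") or {}).get(SYNTHETIC_TIER) or {}
    return sorted(REJECTED_EXAMPLES.intersection(overwrite))

bad = {
    "tableIndexConfig": {
        "tierOverwrites": {
            "consuming": {"noDictionaryColumns": [], "aggregateMetrics": True}
        }
    }
}
print(rejected_consuming_keys(bad))  # ['aggregateMetrics']

# Same overrides, but the table defines a real storage tier named `consuming`.
real_tier = dict(bad, tierConfigs=[{"name": "consuming"}])
print(rejected_consuming_keys(real_tier))  # []
```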
## Validation
- `./mvnw -pl pinot-segment-local -am
-Dtest=TableConfigConsumingSegmentTierOverrideTest,TableConfigUtilsTest,IndexLoadingConfigTest
-Dsurefire.failIfNoSpecifiedTests=false test`
- `./mvnw -pl pinot-integration-tests -am -Dskip.npm=true
-Dtest=ConsumingSegmentTierOverrideRealtimeTest
-Dsurefire.failIfNoSpecifiedTests=false test`
- `./mvnw spotless:apply -pl
pinot-common,pinot-core,pinot-segment-local,pinot-spi,pinot-integration-tests,pinot-tools`
- `./mvnw license:format -pl
pinot-common,pinot-core,pinot-segment-local,pinot-spi,pinot-integration-tests,pinot-tools`
- `./mvnw checkstyle:check -pl
pinot-common,pinot-core,pinot-segment-local,pinot-spi,pinot-integration-tests,pinot-tools`
- `./mvnw license:check -pl
pinot-common,pinot-core,pinot-segment-local,pinot-spi,pinot-integration-tests,pinot-tools`
- `git diff --check upstream/master...HEAD`