merlimat opened a new pull request, #25731:
URL: https://github.com/apache/pulsar/pull/25731
## Motivation
PIP-473 (metadata-driven transactions for scalable topics) needs to scan
secondary indexes by **range**, not just point lookup — the
`idx:txn-by-deadline` sweep finds open txns past their deadline (`key <= now`),
and the GC sweep finds finalized txns past their retention window. The existing
`MetadataStore.findByIndex` is point-only and returns a materialized
`List<GetResult>`, which doesn't fit either use case.
This PR replaces `findByIndex` with a single streaming method that subsumes
both shapes.
## Modifications
**API change** on `MetadataStore`:
```java
CompletableFuture<Void> scanByIndex(
        String scanPathPrefix, String indexName,
        String fromKeyInclusive, String toKeyInclusive,  // null = unbounded on that side
        Predicate<GetResult> fallbackFilter,
        ScanConsumer consumer,
        Set<Option> opts)
```
- Both bounds are **inclusive**. A point lookup passes the same key for both
bounds (`fromKey == toKey == key`). A range that is unbounded on one side passes
`null` for that bound. The consumer callback matches `scanChildren`'s existing
`ScanConsumer`.
- `findByIndex` is removed in the same PR — one method, no separate
point-lookup variant. All call sites are migrated.
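
The bounds contract described above can be illustrated with a toy in-memory model. This is only a sketch of the semantics, not Pulsar code: `InclusiveRangeSketch`, `scan`, and `sampleIndex` are hypothetical names, and the index is simplified to one value per secondary key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model of the inclusive-bounds contract (hypothetical, not the Pulsar API).
final class InclusiveRangeSketch {

    // Returns the values whose secondary key lies in [from, to].
    // A null bound means "unbounded on that side".
    static List<String> scan(NavigableMap<String, String> index, String from, String to) {
        if (from != null && to != null && from.compareTo(to) > 0) {
            return new ArrayList<>(); // empty range
        }
        NavigableMap<String, String> view = index;
        if (from != null) {
            view = view.tailMap(from, true); // inclusive lower bound
        }
        if (to != null) {
            view = view.headMap(to, true); // inclusive upper bound
        }
        return new ArrayList<>(view.values());
    }

    // Sample index keyed by zero-padded deadlines (illustrative data only).
    static NavigableMap<String, String> sampleIndex() {
        NavigableMap<String, String> index = new TreeMap<>();
        index.put("0000000100", "txn-a");
        index.put("0000000200", "txn-b");
        index.put("0000000300", "txn-c");
        return index;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> index = sampleIndex();
        System.out.println(scan(index, "0000000200", "0000000200")); // point lookup
        System.out.println(scan(index, null, "0000000200"));         // unbounded below
        System.out.println(scan(index, "0000000200", null));         // unbounded above
    }
}
```

In this model a point lookup is just the degenerate range where both bounds are the same key, which is why a separate point-lookup method is not needed.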
**Oxia (native)**: `client.list(start, end, ListOption.UseIndex(indexName))`
returns the matching primary keys; each is then fetched via `storeGet` and
streamed to the consumer. The inclusive range is translated to Oxia's half-open
range as `[scanPathPrefix/fromKey, scanPathPrefix/toKey + "~")`. The trailing
`~` sentinel widens the half-open range so that it includes `toKey` for the
secondary-key shapes callers use today (numeric timestamps, fixed-tag enums).
The sentinel is documented in code.
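
A minimal sketch of the range translation follows, assuming keys compare in plain lexicographic order (Oxia's actual key ordering may differ) and that both bounds are non-null. `OxiaRangeSketch` and its methods are hypothetical helpers, not part of this PR or of the Oxia client.

```java
// Sketch of the inclusive-to-half-open translation (hypothetical helper names).
final class OxiaRangeSketch {

    // Builds the half-open [start, end) bounds from the inclusive keys.
    // Only the bounded case is modelled; null handling is out of scope here.
    static String[] toHalfOpenBounds(String scanPathPrefix, String fromKey, String toKey) {
        String start = scanPathPrefix + "/" + fromKey;
        String end = scanPathPrefix + "/" + toKey + "~"; // '~' sorts after digits and letters
        return new String[] {start, end};
    }

    // Assumption: plain lexicographic comparison stands in for the store's ordering.
    static boolean inHalfOpenRange(String key, String start, String end) {
        return key.compareTo(start) >= 0 && key.compareTo(end) < 0;
    }

    public static void main(String[] args) {
        String[] b = toHalfOpenBounds("/txns", "0000000100", "0000000200");
        System.out.println(inHalfOpenRange("/txns/0000000200", b[0], b[1])); // toKey itself
        System.out.println(inHalfOpenRange("/txns/0000000201", b[0], b[1])); // just past toKey
    }
}
```

Under the same assumption, the sentinel only behaves as an inclusive upper bound when keys have a fixed shape: with variable-width keys, a key such as `100` followed by extra digits would also sort below `100~` and be included.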
**ZK / Memory / RocksDB compatibility layer** in `AbstractMetadataStore`:
lists children under `scanPathPrefix`, fetches each, streams those matching
`fallbackFilter`. The fallback ignores `fromKey/toKey` — callers encode the
range in `fallbackFilter`.
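
Because the fallback ignores the key bounds, the caller's filter has to carry the range itself. The sketch below shows one way such a filter could be written. It uses `Map.Entry<String, String>` (id mapped to a deadline key) as a stand-in for `GetResult`, since how a caller extracts the secondary key from a real record is application-specific; `FallbackFilterSketch` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Sketch of encoding an inclusive range in a fallback filter (hypothetical names).
final class FallbackFilterSketch {

    // Entry key = record id, entry value = secondary (deadline) key.
    // A null bound means "unbounded on that side".
    static Predicate<Map.Entry<String, String>> deadlineBetween(String from, String to) {
        return e -> {
            String key = e.getValue();
            return (from == null || key.compareTo(from) >= 0)
                    && (to == null || key.compareTo(to) <= 0);
        };
    }

    // Models the compatibility layer: visit every child, keep those the filter accepts.
    static List<String> scanAll(List<Map.Entry<String, String>> children,
                                Predicate<Map.Entry<String, String>> fallbackFilter) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, String> child : children) {
            if (fallbackFilter.test(child)) {
                ids.add(child.getKey());
            }
        }
        return ids;
    }

    // Illustrative data only.
    static List<Map.Entry<String, String>> sampleChildren() {
        return List.of(
                Map.entry("txn-a", "0000000100"),
                Map.entry("txn-b", "0000000200"),
                Map.entry("txn-c", "0000000300"));
    }

    public static void main(String[] args) {
        // Open transactions whose deadline key is <= "0000000200".
        System.out.println(scanAll(sampleChildren(), deadlineBetween(null, "0000000200")));
    }
}
```

The cost profile differs from the native path: the fallback visits every child under the prefix, so the filter determines only what is delivered, not how much is read.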
**`ScanConsumer.collectInto(List)`** static helper for callers that prefer
the materialized-list shape on top of the streaming API.
**Migrated callers**:
- `ScalableTopicResources.findScalableTopicsAsync` (broker)
- `MetadataStoreSecondaryIndexTest`, `MetadataCacheSecondaryIndexTest`
(added a small in-test `findByIndex` helper that wraps `scanByIndex` +
`collectInto`)
- `DualMetadataStore` wrapper
## Verifying this change
- All existing `findByIndex*` tests migrated to `scanByIndex` +
`collectInto` and continue to pass on every backend (ZK, Memory, RocksDB, Oxia,
MockZooKeeper).
- New `scanByIndexInclusiveRange` and `scanByIndexUnboundedFromKey` cover
the range-query shape end-to-end (zero-padded numeric secondary keys mirroring
PIP-473's `idx:txn-by-deadline`).
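
Zero-padding matters here because secondary keys are compared as strings. The sketch below illustrates the effect; the 19-digit width and the `ZeroPaddedKeySketch` helper are assumptions for illustration, not the encoding used by the tests or by PIP-473.

```java
import java.util.Locale;

// Sketch of a fixed-width numeric key encoding (width is an assumption).
final class ZeroPaddedKeySketch {

    // Encodes a non-negative long as a 19-digit, zero-padded decimal string,
    // so that string order matches numeric order.
    static String encode(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("value must be non-negative");
        }
        return String.format(Locale.ROOT, "%019d", value);
    }

    public static void main(String[] args) {
        // Unpadded: "9" sorts after "10" lexicographically.
        System.out.println("9".compareTo("10") > 0);
        // Padded: order matches the numeric order 9 < 10.
        System.out.println(encode(9).compareTo(encode(10)) < 0);
    }
}
```

With a fixed width, a range query such as `key <= now` selects exactly the numerically smaller deadlines.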
Local results: full `:pulsar-metadata:test` runs cleanly modulo the
recurring `BookKeeper.AuditorLedgerCheckerTest.setUp` `BindException` flake
(port collision; unrelated to this change).
## Does this pull request potentially affect one of the following parts:
- [x] The public `MetadataStore` API — `findByIndex` is removed and replaced
with `scanByIndex`. All in-tree callers are migrated in this PR. External
implementations of `MetadataStore` would need the same migration; the new
method has a default that returns a not-supported failure, matching the
previous default behavior.
## Matching PR in forked repository
PR in forked repository:
https://github.com/merlimat/pulsar/pull/new/mmerli/metadata-scan-by-index