merlimat opened a new pull request, #25731:
URL: https://github.com/apache/pulsar/pull/25731

   ## Motivation
   
   PIP-473 (metadata-driven transactions for scalable topics) needs to scan 
secondary indexes by **range**, not just point lookup — the 
`idx:txn-by-deadline` sweep finds open txns past their deadline (`key <= now`), 
and the GC sweep finds finalized txns past their retention window. The existing 
`MetadataStore.findByIndex` is point-only and returns a materialized 
`List<GetResult>`, which doesn't fit either use case.
   
   This PR replaces `findByIndex` with a single streaming method that subsumes 
both shapes.
   
   ## Modifications
   
   **API change** on `MetadataStore`:
   
   ```java
   CompletableFuture<Void> scanByIndex(
       String scanPathPrefix, String indexName,
       String fromKeyInclusive, String toKeyInclusive,  // null = unbounded on that side
       Predicate<GetResult> fallbackFilter,
       ScanConsumer consumer,
       Set<Option> opts)
   ```
   
   - Both bounds are **inclusive**. A point lookup passes the same key for both 
bounds (`fromKey == toKey == key`). A range that is unbounded on one side passes 
`null` for that bound. The listener shape matches `scanChildren`'s existing 
`ScanConsumer`.
   - `findByIndex` is removed in the same PR — one method, no separate 
point-lookup variant. All call sites are migrated.
   
   **Oxia (native)**: `client.list(start, end, ListOption.UseIndex(indexName))` 
returns matching primary keys; results are then fetched via `storeGet` and 
streamed to the consumer. Range translation: `[scanPathPrefix/fromKey, 
scanPathPrefix/toKey + "~")` — the trailing `~` sentinel widens Oxia's 
half-open range to include `toKey` for the secondary-key shapes callers use 
today (numeric timestamps, fixed-tag enums). Documented in code.
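   
   The range translation above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `IndexRangeSketch` and its methods are hypothetical names, and a `TreeSet<String>` with plain lexicographic ordering stands in for the Oxia index, whose real key ordering may differ. Handling of `null` bounds is omitted.
   
   ```java
   import java.util.ArrayList;
   import java.util.List;
   import java.util.TreeSet;

   // Hypothetical helper, for illustration only.
   final class IndexRangeSketch {
       // Inclusive lower bound of the translated range: scanPathPrefix/fromKey.
       static String startKey(String scanPathPrefix, String fromKeyInclusive) {
           return scanPathPrefix + "/" + fromKeyInclusive;
       }

       // Exclusive upper bound: scanPathPrefix/toKey followed by '~' (0x7E).
       // Under plain lexicographic ordering this sorts after toKey itself,
       // so a half-open [start, end) range still contains toKey.
       static String endKeyExclusive(String scanPathPrefix, String toKeyInclusive) {
           return scanPathPrefix + "/" + toKeyInclusive + "~";
       }

       // Assumption: the TreeSet is a stand-in for the index, not the Oxia client.
       static List<String> scan(TreeSet<String> index, String scanPathPrefix,
                                String fromKeyInclusive, String toKeyInclusive) {
           return new ArrayList<>(index.subSet(
                   startKey(scanPathPrefix, fromKeyInclusive), true,
                   endKeyExclusive(scanPathPrefix, toKeyInclusive), false));
       }
   }
   ```
   
   With an index holding `/txn/0100`, `/txn/0200` and `/txn/0300`, `scan(index, "/txn", "0100", "0200")` returns the first two entries. Under this ordering, any key that merely starts with `toKey` followed by a character below `~` also lands inside the range, which is harmless for fixed-width keys.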
   
   **ZK / Memory / RocksDB compatibility layer** in `AbstractMetadataStore`: 
lists children under `scanPathPrefix`, fetches each, streams those matching 
`fallbackFilter`. The fallback ignores `fromKey/toKey` — callers encode the 
range in `fallbackFilter`.
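   
   Because the fallback ignores the key bounds, a caller has to express the same inclusive range as a predicate. A minimal sketch of one way to do that, assuming the caller can derive the secondary key from each result: `RangeFilters` and the `secondaryKeyOf` extractor are hypothetical and not part of this PR. It is generic so it does not depend on the shape of `GetResult`.
   
   ```java
   import java.util.function.Function;
   import java.util.function.Predicate;

   // Hypothetical helper, for illustration only.
   final class RangeFilters {
       // Builds a predicate that accepts items whose secondary key lies in
       // [fromKeyInclusive, toKeyInclusive]; a null bound means unbounded on that side.
       static <T> Predicate<T> inclusiveRange(Function<T, String> secondaryKeyOf,
                                              String fromKeyInclusive,
                                              String toKeyInclusive) {
           return item -> {
               String key = secondaryKeyOf.apply(item);
               if (key == null) {
                   return false; // items without a secondary key never match
               }
               boolean atOrAboveFrom =
                       fromKeyInclusive == null || key.compareTo(fromKeyInclusive) >= 0;
               boolean atOrBelowTo =
                       toKeyInclusive == null || key.compareTo(toKeyInclusive) <= 0;
               return atOrAboveFrom && atOrBelowTo;
           };
       }
   }
   ```
   
   Passing the same `fromKey` and `toKey` to both the method and such a filter would keep the native and fallback paths in agreement, provided the keys compare correctly as strings.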
   
   **`ScanConsumer.collectInto(List)`** static helper for callers that prefer 
the materialized-list shape on top of the streaming API.
   
   **Migrated callers**:
   - `ScalableTopicResources.findScalableTopicsAsync` (broker)
   - `MetadataStoreSecondaryIndexTest`, `MetadataCacheSecondaryIndexTest` 
(added a small in-test `findByIndex` helper that wraps `scanByIndex` + 
`collectInto`)
   - `DualMetadataStore` wrapper
   
   ## Verifying this change
   
   - All existing `findByIndex*` tests migrated to `scanByIndex` + 
`collectInto` and continue to pass on every backend (ZK, Memory, RocksDB, Oxia, 
MockZooKeeper).
   - New `scanByIndexInclusiveRange` and `scanByIndexUnboundedFromKey` cover 
the range-query shape end-to-end (zero-padded numeric secondary keys mirroring 
PIP-473's `idx:txn-by-deadline`).
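   
   Zero padding matters here because secondary keys are compared as strings: without it, `"9"` sorts after `"10"`. A minimal sketch of such an encoding follows. `DeadlineKeys` is a hypothetical name, and the 19-digit width (enough for any non-negative `long`) is an assumption, not necessarily what the tests or PIP-473 use.
   
   ```java
   import java.util.Locale;

   // Hypothetical helper, for illustration only.
   final class DeadlineKeys {
       // Encodes a non-negative timestamp as a fixed-width, zero-padded decimal
       // string so that string order matches numeric order.
       static String encode(long epochMillis) {
           if (epochMillis < 0) {
               throw new IllegalArgumentException("timestamp must be non-negative");
           }
           // Locale.ROOT keeps the digits ASCII regardless of the default locale.
           return String.format(Locale.ROOT, "%019d", epochMillis);
       }
   }
   ```
   
   With this encoding, a sweep for deadlines up to `now` can use `encode(now)` as the inclusive upper bound and leave the lower bound `null`.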
   
   Local results: full `:pulsar-metadata:test` runs cleanly modulo the 
recurring `BookKeeper.AuditorLedgerCheckerTest.setUp` `BindException` flake 
(port collision; unrelated to this change).
   
   ## Does this pull request potentially affect one of the following parts:
   
   - [x] The public `MetadataStore` API — `findByIndex` is removed and replaced 
with `scanByIndex`. All in-tree callers are migrated in this PR. External 
implementations of `MetadataStore` would need the same migration; the new 
method has a default that returns a not-supported failure, matching the 
previous default behavior.
   
   ## Matching PR in forked repository
   
   PR in forked repository: 
https://github.com/merlimat/pulsar/pull/new/mmerli/metadata-scan-by-index

