raghavyadav01 opened a new pull request, #18852:
URL: https://github.com/apache/pinot/pull/18852
## Motivation
Today a vector index is written as a per-column sidecar in the segment
directory: a flat
`.vector.ivfflat.index` / `.vector.ivfpq.index` file for IVF, or a Lucene
HNSW directory for
HNSW (plus the lazily-built `.vector.hnsw.mapping` file). Every vector
column adds extra
inodes per segment, and on large clusters with many vector columns the file
count balloons.
The text index already solved this by packing its Lucene directory into
`columns.psf` (the
segment's single-file index buffer) via the V2→V3 converter. This PR adds
the same pattern
to IVF and HNSW vector indexes — opt-in via a new `storeInSegmentFile` flag
on
`VectorIndexConfig`, default `false` so behaviour is unchanged.
End-to-end verification (against MinIO via an S3-tier offline build, same
vector data
indexed twice — once with the flag, once without):
| Form | Files per segment on S3 | Cold-mmap query | HNSW prune
(numDocsScanned/total) |
|---|---|---|---|
| Sidecar (`storeInSegmentFile=false`) | 9 | ~22 ms | 10 / 400 |
| Consolidated (`storeInSegmentFile=true`) | 4 | ~6 ms | 10 / 400 |
The 5 Lucene HNSW files (`_0.cfe`, `_0.cfs`, `_0.si`, `segments_1`,
`write.lock`) and the
mapping file are gone in the consolidated form — those bytes are a typed
entry inside
`columns.psf` (verified via `index_map`:
`embedding.vector_index.startOffset=48, size=12034`).
## What changes
### IVF (flat file)
- **Readers consume `PinotDataBuffer`.** `IvfFlatVectorIndexReader`,
`IvfOnDiskVectorIndexReader`,
and `IvfPqVectorIndexReader` now read from a `PinotDataBuffer` slice. New
`IvfCombinedBuffers`
helper opens the buffer for both the consolidated path (slice owned by the
segment directory)
and the legacy path (mmap of the sidecar). Readers honour an `ownsBuffer`
flag so borrowed
slices (`columns.psf` views) are not closed by the reader's `close()`.
- **Build path.** `IvfFlatVectorIndexCreator`, `IvfPqVectorIndexCreator`
write a transient
`.combined.index` file when the flag is on. The V2→V3 converter picks it
up via the standard
`copyIndexIfExists` loop and packs it into `columns.psf`.
### HNSW (Lucene directory)
- **Build-time consolidation** via new `HnswVectorIndexCombined`: packs the
Lucene HNSW
directory's files plus the docId mapping (when present) into a single
`.vector.hnsw.combined.index` file using the same `LUCENE_V2` layout the
text index uses.
Reuses `LuceneCombinedTextIndexConstants` — no duplicate format definition.
- **Read path** via new `HnswVectorIndexBufferReader`: parses the combined
buffer and returns
a Lucene `Directory` built on `PinotBufferLuceneDirectory` +
`PinotBufferIndexInput` (reused
from the text-index package — `KnnVectorsReader` uses the same `Directory`
interface, so no
HNSW-specific Directory adapter is needed).
- `HnswVectorIndexReader` gains a `PinotDataBuffer`-based constructor. The
`DocIdTranslator`
reads its mapping from inside the combined buffer when available, or
rebuilds it in heap
when not.
### Shared
- **Bidirectional migration** in `VectorIndexHandler`: forward (sidecar →
`columns.psf`) and
reverse (`columns.psf` → sidecar) on a flag toggle, without rebuilding the
index. For HNSW
the absorb step first packs the Lucene directory into a transient combined
file; the extract
step unpacks the combined entry back into a Lucene directory.
- **Crash recovery.** `absorbCombinedIntoColumnsPsf` detects a
previously-crashed absorb via
`hasIndexFor` + size check on the typed entry — no exception-message
matching. Size mismatch
refuses with `IOException` rather than guessing which copy is
authoritative.
- **Orphan cleanup.** A crash with `storeInSegmentFile=true` followed by a
retry with the
flag off leaves a stale combined file + `.inprogress` marker; the next
build sweeps both
before starting. Symmetric in the other direction. Works for both IVF
(flat orphan) and HNSW
(legacy Lucene directory orphan, recursive delete).
- **HNSW extract ordering.** Unpack into Lucene directory runs BEFORE
`removeIndex`. A crash
mid-unpack leaves the typed entry in `columns.psf` and the next handler
run retries — no
permanent loss.
- **Mixed-state convert.** If a V1/V2 segment somehow has both a legacy
sidecar and a
combined file for the same column, the V2→V3 converter packs the combined
file and drops
the legacy sibling — combined wins.
### SPI / utility plumbing
- `V1Constants.Indexes` gains
`VECTOR_IVF_FLAT_COMBINED_INDEX_FILE_EXTENSION`,
`VECTOR_IVF_PQ_COMBINED_INDEX_FILE_EXTENSION`,
`VECTOR_HNSW_COMBINED_INDEX_FILE_EXTENSION`.
- `SegmentDirectoryPaths.findVectorIndexIndexFile` probes both legacy and
combined
extensions for IVF and HNSW. Legacy wins on tie (stable behaviour for
unchanged segments).
- `VectorIndexUtils.hasCombinedFormVectorIndex` is a new predicate the
converter uses;
`hasVectorIndex` deliberately excludes the combined form so the converter
knows to pack it
rather than preserve it as a sibling.
- `SingleFileIndexDirectory` symmetry fixes in `hasIndexFor`,
`getColumnsWithIndex`,
`removeIndex` — vector entries are now treated uniformly whether they live
as a sidecar or
as a `_columnEntries` slot.
## Backward compatibility
- Default behaviour unchanged. `storeInSegmentFile=false` preserves the
legacy sidecar /
Lucene-directory layout exactly. Existing segments load and serve
identically.
- No segment-format or wire-protocol break. The `.combined.index` extensions
are build-time
transient and never appear in a V3 segment after the converter runs.
- Mixed-version safe: a server running this code reads a segment built
without the flag
exactly as before; the new code only fires when the flag is set in the
table config.
## Tests
- **132 tests green** (130 unit + 2 integration). New classes:
- `IvfFlatConsolidatedVectorTest` — end-to-end integration: build with
flag=true, load,
run `vectorSimilarity` query, including a `MAP<STRING,STRING>` column to
exercise the
S3A read path.
- `IvfDiskFormatInspectionTest` — byte-level header dumps validate creator
output
matches the documented IVF format unchanged in both extensions.
- `IvfReaderFailurePathTest` — mmap-count tracking proves borrowed buffers
are not
closed by the reader (`ownsBuffer=false`) and that constructor failures
still release
owned buffers.
- `HnswVectorIndexCombinedTest` — pack/unpack round-trip plus an
end-to-end HNSW build
read via the buffer-backed `Directory`.
- `VectorIndexHandlerTest` — bidirectional migration for both IVF and HNSW,
crash-recovery, size-mismatch refusal,
extract-refusal-when-sidecar-exists, orphan
sweep in both directions.
- Spotless / checkstyle / license clean.
## Related
The buffer-backed Lucene `Directory` plumbing (`PinotBufferLuceneDirectory`,
`PinotBufferIndexInput`, `LuceneCombinedTextIndexConstants`) is reused as-is
from the
text-index implementation — this PR adds zero new code in those generic
helpers.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]