[DISCUSS] Secondary Indexes (Phase 1): Bloom filter skipping index (Puffin, snapshot-scoped)

huaxin gao Thu, 08 Jan 2026 08:27:45 -0800

Hi Iceberg community,

I’d like to request feedback on a proposal
<https://docs.google.com/document/d/1x-0KT43aTrt8u6EV7EgSietIFQSkGsocqwnBTHPebRU/edit?tab=t.0>
to introduce secondary indexes to Apache Iceberg with a narrow, incremental
scope.


Phase 1 adds file-skipping indexes based on per-column Bloom filters,
stored in Puffin and referenced from table metadata so query engines can
use them during planning to prune data files. Indexes are advisory-only and
snapshot-scoped. The proposal is fully backward compatible: engines that
don’t understand the new metadata fields ignore them.
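
To make the planning-time idea concrete, here is a minimal sketch in Python of how per-file Bloom filters on one column could be used to skip data files for an equality predicate. It is illustrative only: the names BloomFilter, build_index, and prune_files, the filter size, and the hashing scheme are all assumptions made for this example and are not taken from the proposal, and it does not model Puffin blobs or Iceberg metadata.

```python
import hashlib
from typing import Dict, Iterable, List


class BloomFilter:
    """Tiny Bloom filter. Size and hashing are illustrative assumptions."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, value: str) -> Iterable[int]:
        # Derive one bit position per hash by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{value}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value: str) -> None:
        for pos in self._positions(value):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, value: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(value)
        )


def build_index(files: Dict[str, List[str]]) -> Dict[str, BloomFilter]:
    """Build one filter per data file over the values of a single column."""
    index: Dict[str, BloomFilter] = {}
    for path, values in files.items():
        bloom = BloomFilter()
        for value in values:
            bloom.add(value)
        index[path] = bloom
    return index


def prune_files(
    paths: Iterable[str], index: Dict[str, BloomFilter], value: str
) -> List[str]:
    """Return the files that still need to be scanned for `column = value`."""
    kept = []
    for path in paths:
        bloom = index.get(path)
        # Advisory behavior: a file with no filter is always kept, so a
        # missing or unreadable index can never change query results.
        if bloom is None or bloom.might_contain(value):
            kept.append(path)
    return kept


# Hypothetical data files and column values, for illustration only.
data_files = {
    "data/file-a.parquet": ["alice", "bob"],
    "data/file-b.parquet": ["carol", "dave"],
}
index = build_index(data_files)
all_paths = list(data_files) + ["data/file-c.parquet"]  # file-c has no filter
to_scan = prune_files(all_paths, index, "carol")
```

Because a Bloom filter can return false positives but never false negatives, a file that contains the value is always kept, and the worst case for a planner that uses the index is scanning a file it could have skipped.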

I’d appreciate any feedback, questions, or concerns on the overall
direction and design.

Best,

Huaxin
