Hi Iceberg community,

I’d like to request feedback on a proposal <https://docs.google.com/document/d/1x-0KT43aTrt8u6EV7EgSietIFQSkGsocqwnBTHPebRU/edit?tab=t.0> to introduce secondary indexes to Apache Iceberg with a narrow, incremental scope.

Phase 1 adds file-skipping indexes based on per-column Bloom filters, stored in Puffin and referenced from table metadata so query engines can use them during planning to prune data files. Indexes are advisory-only and snapshot-scoped. The proposal is fully backward compatible: engines that don’t understand the new metadata fields ignore them.

I’d appreciate any feedback, questions, or concerns on the overall direction and design.

Best,
Huaxin
