Re: [DISCUSS] V4 - indexing support

huaxin gao Thu, 04 Dec 2025 14:19:40 -0800

Hi all,
Miao and I plan to resume the secondary index work. The proposal
<https://docs.google.com/document/d/1E1ofBQoKRnX04bWT3utgyHQGaHZoelgXosk_UNsTUuQ/edit?tab=t.0>
was written in 2020 and has been reviewed. We will bring it up to date.


Thanks,
Huaxin

On Mon, Nov 24, 2025 at 7:23 AM Anirban Goswami <[email protected]>
wrote:

> Peter,
>
> That is definitely something that is pulling me away from the database idea. It
> is one of several thoughts.
>
> How we can get rid of more files is my main driver. Let’s discuss more.
>
> Ani
>
> On 24 Nov 2025, at 7:25 PM, Péter Váry <[email protected]>
> wrote:
>
> Hi Anirban,
>
> I don't really like the dependency on the external database for the index.
> Every reader would need access to the database, and given a big table
> with several readers, it could become a bottleneck.
>
> I can imagine something similar as part of a REST catalog where the
> catalog is used for planning:
> - The Catalog could decide to read and cache the metadata from the files
> (the cache could be stored in a db, or rocksdb, or whatever)
> - During the planning the Catalog could get the relevant rowgroups, and
> combine them back into a smaller number of splits (if there are
> contiguous rowgroups, they could be combined)
> - The users don't need to do anything else, just call the Catalog planning
> API.
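
A minimal sketch of the rowgroup-combining step from the list above, assuming the catalog has already selected the matching rowgroup ordinals for a single data file. The function name `combine_rowgroups` and the inclusive `(first, last)` split representation are hypothetical and are not part of any Iceberg or REST catalog API.

```python
from typing import List, Tuple


def combine_rowgroups(rowgroups: List[int]) -> List[Tuple[int, int]]:
    """Merge rowgroup ordinals into inclusive (first, last) runs.

    Adjacent ordinals (n, n + 1) end up in the same split, so a reader
    can fetch them with one contiguous read.
    """
    splits: List[Tuple[int, int]] = []
    for rg in sorted(set(rowgroups)):
        if splits and rg == splits[-1][1] + 1:
            # Extends the current run: grow the last split by one rowgroup.
            splits[-1] = (splits[-1][0], rg)
        else:
            # Gap before this rowgroup: start a new split.
            splits.append((rg, rg))
    return splits


if __name__ == "__main__":
    print(combine_rowgroups([0, 1, 2, 5, 7, 8]))
```

Here six selected rowgroups collapse into three splits, which is the kind of reduction the planning call would hand back to the reader.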
>
> In this way, we don't have to change the metadata to get the same gains.
>
> WDYT?
>
> On Tue, Nov 18, 2025 at 7:48 PM Anirban Goswami <[email protected]>
> wrote:
>
>> Thanks Peter.
>>
>> I was also doing some analysis on how to get a secondary index in Iceberg,
>> as we are dealing with several use cases where the table is pretty big and
>> partitioned on different keys. When we query by other keys,
>> it is always difficult to get good response times, or response times similar to
>> what Snowflake or similar systems provide through acceleration or
>> search optimization methods.
>>
>> We already have a huge metadata load, and if we add the index as
>> files on the file system, it will be too much to process and maintain as well. I have
>> created a doc with some thoughts and would like to understand how you look at it.
>>
>> OLTP Database-Backed Index Architecture for Apache Iceberg
>> <https://docs.google.com/document/d/15230FAEF3_8EEEniZ2c-S6I46dECDWAjDoInpNNHdiQ/edit?sharingaction=ownershiptransfer&pli=1&tab=t.0#heading=h.zcwgk1s56yiy>
>>
>> Regards,
>> Ani
>>
>>
>> On 2025/11/18 11:32:24 Péter Váry wrote:
>> > Hi Team,
>> >
>> > Do we have any progress on this topic? I’d really like to see this move
>> > forward.
>> >
>> > Following Sreeram’s suggestion, we should start collecting the key use
>> > cases we want to support with indexes. Here’s what I’ve heard so far:
>> >
>> >    - *Primary key index*
>> >       - Find a single or few rows by a given primary key
>> >       - Build the Flink “primary key → file_name, position” state by bulk
>> >       reading the primary key index
>> >    - *Secondary index*
>> >       - Range or min/max filtering on columns that are not part of the
>> >       primary key (primary sort order)
>> >    - *Full-text index*
>> >       - Term search in text columns
>> >    - *Vector index*
>> >       - Nearest or approximate nearest neighbor search
>> >    - *Geospatial index*
>> >       - Finding points within a polygon or nearest location
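
A minimal sketch of the "primary key → file_name, position" mapping from the primary key use case above, assuming keys are unique per table and that the index is small enough to hold in memory. The names `build_pk_index` and `Location` are hypothetical; this is not how Flink state or any Iceberg index file is actually laid out.

```python
from typing import Dict, Hashable, List, Tuple

# (file_name, position of the row inside that file)
Location = Tuple[str, int]


def build_pk_index(files: Dict[str, List[Hashable]]) -> Dict[Hashable, Location]:
    """Bulk-build a key -> (file_name, position) map.

    `files` maps each data file name to the primary keys of its rows,
    in row order. If a key appears more than once, the last file read wins.
    """
    index: Dict[Hashable, Location] = {}
    for file_name, keys in files.items():
        for position, key in enumerate(keys):
            index[key] = (file_name, position)
    return index


if __name__ == "__main__":
    index = build_pk_index({"a.parquet": [10, 11], "b.parquet": [12]})
    # Point lookup: find a single row by primary key.
    print(index.get(11))
    print(index.get(99))
```

A point lookup is then a single dictionary access, and a missing key returns `None` instead of forcing a scan.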
>> >
>> > We should identify a few critical use cases and keep the others in mind
>> > when designing how we store, retrieve, and use these indexes. Personally,
>> > I’d love to see *vector indexes in Iceberg*, enabling fast AI searches on
>> > Iceberg tables.
>> >
>> > For reference, I asked Copilot to collect the currently available index
>> > types in MSSQL, Oracle, Postgres, MySQL, and LanceDB. Here’s the list:
>> >
>> > https://docs.google.com/spreadsheets/d/14cBdwsOw89ivolHtAw342YNoGmb1-Kri1E80hwWymL0
>> >
>> > Thanks,
>> >
>> > Peter
>> >
>> >
>> > On Sun, Nov 2, 2025 at 4:11 AM Aihua Xu <[email protected]> wrote:
>> >
>> > > Thanks Steven for raising this topic and giving a summary on the
>> > > proposals. I would like to get involved in this area.
>> > >
>> > > On Fri, Oct 31, 2025 at 4:49 PM huaxin gao <[email protected]> wrote:
>> > >
>> > >> Thanks, Steven, for taking the initiative. I have previously collaborated
>> > >> with Miao from Adobe on secondary index and would like to continue that
>> > >> work.
>> > >>
>> > >> Huaxin
>> > >>
>> > >> On Fri, Oct 31, 2025 at 1:07 PM Xinli shang <[email protected]>
>> > >> wrote:
>> > >>
>> > >>> Thanks Steven for proposing this! This is the right direction to go.
>> > >>> We definitely see challenges in some cases without indexing support,
>> > >>> especially around equality deletes and point lookups. I would like to
>> > >>> contribute as well. One thing we need to be careful about is the overhead of
>> > >>> the index itself, such as memory usage and index updates.
>> > >>>
>> > >>> Namratha, for Parquet column index, we had one for Presto
>> > >>> https://www.youtube.com/watch?v=fr_HdhMEa3s.
>> > >>>
>> > >>>
>> > >>>
>> > >>>
>> > >>> On Fri, Oct 31, 2025 at 11:48 AM namratha mk <[email protected]>
>> > >>> wrote:
>> > >>>
>> > >>>> Hi,
>> > >>>>
>> > >>>> I see this point in the doc:
>> > >>>>
>> > >>>> *The primary key index can also be useful for point lookup.*
>> > >>>> But to achieve this, we would need to store native file format
>> > >>>> metadata, like the Parquet page index
>> > >>>> <https://parquet.apache.org/docs/file-format/pageindex/>, in the
>> > >>>> primary index, which would help with fetching in the lookup use case. Has there been
>> > >>>> any discussion in the community about this? I would like to get more opinions on
>> > >>>> this.
>> > >>>>
>> > >>>> Thanks,
>> > >>>> Namratha
>> > >>>>
>> > >>>> On Sat, Jul 19, 2025 at 2:39 AM Manish Malhotra <
>> > >>>> [email protected]> wrote:
>> > >>>>
>> > >>>>> Thanks Steven,
>> > >>>>> +1 on this initiative. I am also interested in contributing in this
>> > >>>>> area.
>> > >>>>> As you mentioned, it has quite a breadth. My thought is that we can start a
>> > >>>>> document to discuss the different layers separately, such as types of indexes, sync
>> > >>>>> vs async, spec changes, and the priority of the indexes to be supported (instead of
>> > >>>>> targeting all in one go).
>> > >>>>>
>> > >>>>> Thanks,
>> > >>>>> Manish
>> > >>>>>
>> > >>>>> On Fri, Jul 18, 2025 at 10:41 PM Steven Wu <[email protected]>
>> > >>>>> wrote:
>> > >>>>>
>> > >>>>>> Vignesh, that is yet to be discussed. We haven't got to that kind of
>> > >>>>>> detail yet.
>> > >>>>>>
>> > >>>>>> In some cases, the index files are expected to be added along with
>> > >>>>>> the data files in the same commit. Maybe some cases (like secondary index)
>> > >>>>>> would prefer an async process.
>> > >>>>>>
>> > >>>>>> On Fri, Jul 18, 2025 at 4:11 PM Vignesh <[email protected]>
>> > >>>>>> wrote:
>> > >>>>>>
>> > >>>>>>> Are the index files for all kinds expected to be written and added
>> > >>>>>>> along with data files or would it be an optional async step?
>> > >>>>>>>
>> > >>>>>>> On Fri, Jul 18, 2025, 5:09 AM Péter Váry <
>> > >>>>>>> [email protected]> wrote:
>> > >>>>>>>
>> > >>>>>>>> > *Primary Index*: Conventionally Primary Index - just means what
>> > >>>>>>>> the Table's Primary storage layout/organization was. Given that Iceberg
>> > >>>>>>>> supports Sort-order - if the Spec adds constraints to derive/influence Sort
>> > >>>>>>>> order based on the Identifier columns - it satisfies the Primary Index
>> > >>>>>>>> criteria.
>> > >>>>>>>>
>> > >>>>>>>> Here is my mental model:
>> > >>>>>>>> - Primary Key - the unique identifier for the rows
>> > >>>>>>>> - Primary Key index - database index constructed on the Primary Key
>> > >>>>>>>> column
>> > >>>>>>>> - Iceberg sort order - performance optimization used to speed up
>> > >>>>>>>> frequent, or costly queries.
>> > >>>>>>>>
>> > >>>>>>>> The Iceberg sort order is often defined on different columns
>> > >>>>>>>> than the Primary Key, so I would try to avoid mixing the two concepts.
>> > >>>>>>>>
>> > >>>>>>>> > we found that an Iceberg Table based Store Secondary Index -
>> > >>>>>>>> provides the right balance between the ability to skip over and load needed
>> > >>>>>>>> sections and yet provide the right performance benefits.
>> > >>>>>>>>
>> > >>>>>>>> Could you please elaborate on what "Iceberg Table based Store
>> > >>>>>>>> Secondary Index" means?
>> > >>>>>>>> Is this another Iceberg table with different columns and different
>> > >>>>>>>> sort order?
>> > >>>>>>>>
>> > >>>>>>>> > they want it to be in an open format, so that it can be shared
>> > >>>>>>>> with other engines!
>> > >>>>>>>>
>> > >>>>>>>> Wholeheartedly agreed!
>> > >>>>>>>>
>> > >>>>>>>> Thanks Steven for starting, and others for participating in the
>> > >>>>>>>> discussion!
>> > >>>>>>>> Peter
>> > >>>>>>>>
>> > >>>>>>>> On Tue, Jul 15, 2025 at 10:12 PM Sreeram Garlapati <[email protected]>
>> > >>>>>>>> wrote:
>> > >>>>>>>>
>> > >>>>>>>>> Thanks Steven for starting this.
>> > >>>>>>>>>
>> > >>>>>>>>> I am interested in the indexing-related conversations.
>> > >>>>>>>>>
>> > >>>>>>>>> Here are some preliminary thoughts:
>> > >>>>>>>>>
>> > >>>>>>>>>    1. *Primary Index*: Conventionally Primary Index - just means
>> > >>>>>>>>>    what the Table's Primary storage layout/organization was. Given that
>> > >>>>>>>>>    Iceberg supports Sort-order - if the Spec adds constraints to
>> > >>>>>>>>>    derive/influence Sort order based on the Identifier columns - it satisfies
>> > >>>>>>>>>    the Primary Index criteria.
>> > >>>>>>>>>    2. *Secondary Index*: Secondary Index storage calls for an
>> > >>>>>>>>>    efficient organization which can hold Secondary Keys along with the
>> > >>>>>>>>>    Location of the Row and any included columns. The index can be of many
>> > >>>>>>>>>    types, based on the Data. Iceberg tables are typically very, very large. Hence,
>> > >>>>>>>>>    these Indexes also tend to be very large. Based on our past 1-2 years of
>> > >>>>>>>>>    work in this space, we found that an Iceberg Table based Store Secondary
>> > >>>>>>>>>    Index - provides the right balance between the ability to skip over and
>> > >>>>>>>>>    load needed sections and yet provide the right performance benefits. This
>> > >>>>>>>>>    decision was also shaped by popular opinion from many of our partners &
>> > >>>>>>>>>    customers - as building the Index involves a lot of computation, they
>> > >>>>>>>>>    want it to be in an open format, so that it can be shared with other
>> > >>>>>>>>>    engines!
>> > >>>>>>>>>    3. *Others: Full Text Search Indexes and Vector Indexes*: It
>> > >>>>>>>>>    is critical that we allow years of innovation in the space of Full Text
>> > >>>>>>>>>    Search and Vector indexes, especially with the current acceleration in AI
>> > >>>>>>>>>    adoption & the need it is driving on the Keyword and Similarity Search
>> > >>>>>>>>>    space. Given that Iceberg tables are extremely large, it is critical for us
>> > >>>>>>>>>    to provide a good story for Indexes that can be incrementally updated /
>> > >>>>>>>>>    partially loaded into memory.
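
A minimal sketch of the secondary index shape described in item 2 above: entries holding a secondary key together with the location of the row, kept sorted by key so a range predicate only touches the relevant section. The `IndexEntry` type and `range_lookup` function are hypothetical, and an in-memory sorted list stands in for whatever storage (for example another Iceberg table) would actually hold the index.

```python
import bisect
from typing import List, NamedTuple, Tuple


class IndexEntry(NamedTuple):
    key: int          # secondary key value
    file_name: str    # data file holding the row
    position: int     # row position inside that file


def range_lookup(entries: List[IndexEntry], low: int, high: int) -> List[Tuple[str, int]]:
    """Return row locations whose key is in [low, high].

    `entries` must already be sorted by key; binary search finds the
    boundaries so entries outside the range are skipped, not scanned.
    """
    keys = [e.key for e in entries]
    start = bisect.bisect_left(keys, low)
    end = bisect.bisect_right(keys, high)
    return [(e.file_name, e.position) for e in entries[start:end]]


if __name__ == "__main__":
    entries = sorted([
        IndexEntry(30, "b.parquet", 0),
        IndexEntry(10, "a.parquet", 0),
        IndexEntry(20, "a.parquet", 1),
        IndexEntry(40, "b.parquet", 1),
    ])
    print(range_lookup(entries, 20, 30))
```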
>> > >>>>>>>>>
>> > >>>>>>>>>
>> > >>>>>>>>> Looking forward to the discussions.
>> > >>>>>>>>>
>> > >>>>>>>>> Best,
>> > >>>>>>>>> Sreeram
>> > >>>>>>>>>
>> > >>>>>>>>> On Tue, Jul 15, 2025 at 9:33 AM Anurag Mantripragada
>> > >>>>>>>>> <[email protected]> wrote:
>> > >>>>>>>>>
>> > >>>>>>>>>> Thanks for starting this thread, Steven!
>> > >>>>>>>>>>
>> > >>>>>>>>>> I have been interested in secondary indexing in Iceberg. There
>> > >>>>>>>>>> was an old proposal for secondary indexing [1]; we may need to revisit/redesign
>> > >>>>>>>>>> these structures. I agree this is a very broad topic and having indexing
>> > >>>>>>>>>> structures general enough to support a wide range of use-cases will be a
>> > >>>>>>>>>> key challenge.
>> > >>>>>>>>>>
>> > >>>>>>>>>> I would like to get involved in any discussions related to indexing.
>> > >>>>>>>>>>
>> > >>>>>>>>>> [1] -
>> > >>>>>>>>>> https://docs.google.com/document/d/1E1ofBQoKRnX04bWT3utgyHQGaHZoelgXosk_UNsTUuQ/edit?tab=t.0
>> > >>>>>>>>>>
>> > >>>>>>>>>>
>> > >>>>>>>>>> Thanks,
>> > >>>>>>>>>> Anurag Mantripragada
>> > >>>>>>>>>>
>> > >>>>>>>>>>
>> > >>>>>>>>>> On Jul 15, 2025, at 2:37 AM, Maximilian Michels <[email protected]>
>> > >>>>>>>>>> wrote:
>> > >>>>>>>>>>
>> > >>>>>>>>>> Thanks Steven for the summary. It would be great to extend the
>> > >>>>>>>>>> Iceberg spec with index files, such that they can be used for the different
>> > >>>>>>>>>> use cases.
>> > >>>>>>>>>>
>> > >>>>>>>>>> For my understanding, let me further outline the different types
>> > >>>>>>>>>> of use cases for index files:
>> > >>>>>>>>>>
>> > >>>>>>>>>> ---
>> > >>>>>>>>>> Topic 1: Accelerating the resolution of equality deletes
>> > >>>>>>>>>> ---
>> > >>>>>>>>>>
>> > >>>>>>>>>> In its current form, equality deletes make it impossible to
>> > >>>>>>>>>> achieve proper merge-on-read performance in streaming reads, and they also
>> > >>>>>>>>>> add a significant performance overhead in batch pipelines.
>> > >>>>>>>>>>
>> > >>>>>>>>>> Approach (a):
>> > >>>>>>>>>> https://docs.google.com/document/d/1Jz4Fjt-6jRmwqbgHX_u0ohuyTB9ytDzfslS7lYraIjk/
>> > >>>>>>>>>> Converting equality deletes to positional deletes would be a
>> > >>>>>>>>>> great achievement. I'm wondering, though, whether all engines will be able to
>> > >>>>>>>>>> achieve this. There is quite a bit of runtime complexity involved.
>> > >>>>>>>>>> If I understand correctly, the index can be bootstrapped via table
>> > >>>>>>>>>> maintenance tasks, then has to be maintained by the streaming writer.
>> > >>>>>>>>>>
>> > >>>>>>>>>> Approach (b):
>> > >>>>>>>>>> https://lists.apache.org/thread/gjjr30txq318qp6pff3x5fx1jmdnr6fv
>> > >>>>>>>>>> This would boost the resolution of equality deletes during reads
>> > >>>>>>>>>> via indices. The indices can be built via maintenance tasks, or directly by
>> > >>>>>>>>>> the writer as in (a). But how to keep the index fresh if we don't write the
>> > >>>>>>>>>> index at the writers? Readers won't always be able to use an
>> > >>>>>>>>>> up-to-date index, making this less suitable for streaming reads.
>> > >>>>>>>>>>
>> > >>>>>>>>>> ---
>> > >>>>>>>>>> Topic 2: Full text search in table scans
>> > >>>>>>>>>> ---
>> > >>>>>>>>>>
>> > >>>>>>>>>>
>> > >>>>>>>>>>
>> > >>>>>>>>>> https://docs.google.com/document/d/1bMACRCJBB8ycSXCFbP_BdCbFCAegRoxr2O2NXZirOmY/edit
>> > >>>>>>>>>> Adding full-text search would broaden Iceberg’s applicability,
>> > >>>>>>>>>> enabling new search use cases and making table scans far more powerful.
>> > >>>>>>>>>>
>> > >>>>>>>>>> Cheers,
>> > >>>>>>>>>> Max
>> > >>>>>>>>>>
>> > >>>>>>>>>> On Wed, Jul 9, 2025 at 11:35 PM Steven Wu <[email protected]>
>> > >>>>>>>>>> wrote:
>> > >>>>>>>>>>
>> > >>>>>>>>>>>
>> > >>>>>>>>>>> Similar to other V4 threads, I am starting a thread to gauge
>> > >>>>>>>>>>> interest in adding index support in Iceberg V4 and gather a focus group in
>> > >>>>>>>>>>> this area.
>> > >>>>>>>>>>>
>> > >>>>>>>>>>> There have been a few discussions related to indexing recently.
>> > >>>>>>>>>>>
>> > >>>>>>>>>>>    - Peter Vary and I are working on a proposal (WIP) to
>> > >>>>>>>>>>>    only write position deletes in the Flink streaming writer. It would need a
>> > >>>>>>>>>>>    primary key index to work reasonably efficiently. [1]
>> > >>>>>>>>>>>    - Xiaoxuan Li has a proposal to leverage index files to
>> > >>>>>>>>>>>    improve merge-on-read performance with equality deletes. [2]
>> > >>>>>>>>>>>    - pengzhiwei has a proposal to support full-text index and
>> > >>>>>>>>>>>    vector index. [3]
>> > >>>>>>>>>>>
>> > >>>>>>>>>>>
>> > >>>>>>>>>>> *Idea: index files*
>> > >>>>>>>>>>>
>> > >>>>>>>>>>> To support those use cases, Iceberg can add support for index
>> > >>>>>>>>>>> files (in addition to data files and delete files). It should be general
>> > >>>>>>>>>>> enough to support different forms of indexing.
>> > >>>>>>>>>>>
>> > >>>>>>>>>>>    - Primary key index
>> > >>>>>>>>>>>    - Secondary index
>> > >>>>>>>>>>>    - Full text index
>> > >>>>>>>>>>>    - Vector index
>> > >>>>>>>>>>>
>> > >>>>>>>>>>>
>> > >>>>>>>>>>> This email is a starting point. It is a large topic. A lot of
>> > >>>>>>>>>>> discussions and maturation of the ideas are needed before a formal proposal.
>> > >>>>>>>>>>>
>> > >>>>>>>>>>> Thanks,
>> > >>>>>>>>>>> Steven
>> > >>>>>>>>>>>
>> > >>>>>>>>>>> [1]
>> > >>>>>>>>>>> https://docs.google.com/document/d/1Jz4Fjt-6jRmwqbgHX_u0ohuyTB9ytDzfslS7lYraIjk/
>> > >>>>>>>>>>> (WIP)
>> > >>>>>>>>>>> [2]
>> > >>>>>>>>>>> https://lists.apache.org/thread/j4zl44g6dllzzyg9ln45pvgoosfhxqrq
>> > >>>>>>>>>>> [3] https://github.com/apache/iceberg/issues/12636
>> > >>>>>>>>>>>
>> > >>>>>>>>>>>
>> > >>>>>>>>>>>
>> > >>>>>>>>>>
>> > >>>
>> > >>> --
>> > >>> Xinli Shang
>> > >>>
>> > >>
>> >
>>
>
>
