Re: [DISCUSS] V4 - indexing support

Guy Khazma Mon, 08 Dec 2025 07:53:43 -0800

Hi Huaxin, Peter,

I will be happy to participate in these discussions as well.
I think there is potential to leverage some of the work done to support
deletion vectors and use it for this use case of storing indexes.
Specifically, the manifest has the *referenced_data_file* field, which is
currently used to reference the deletion vectors but could also be used to
reference the index file.
Similarly, content_offset and content_size_in_bytes could be used to pack
multiple indexes together.
It might also make sense to introduce an index manifest to track the
indexes individually.
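
To make the packing idea concrete, here is a minimal sketch, not a spec proposal. It assumes index blobs are written back to back into one file and located by offset and size. The field names referenced_data_file, content_offset and content_size_in_bytes are the ones mentioned above; IndexEntry, index_file, pack_indexes and read_index are hypothetical names used only for illustration.

```python
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    """Hypothetical manifest-style entry locating one packed index blob."""
    referenced_data_file: str   # data file the index describes
    index_file: str             # file that holds the packed blobs (assumed field)
    content_offset: int         # byte offset of the blob inside index_file
    content_size_in_bytes: int  # length of the blob in bytes


def pack_indexes(index_file, blobs, out):
    """Write blobs back to back into `out` and return one entry per blob."""
    entries = []
    for data_file, blob in blobs.items():
        offset = out.tell()
        out.write(blob)
        entries.append(IndexEntry(data_file, index_file, offset, len(blob)))
    return entries


def read_index(entry, src):
    """Read exactly one blob using only its offset and size."""
    src.seek(entry.content_offset)
    return src.read(entry.content_size_in_bytes)


buf = io.BytesIO()
entries = pack_indexes(
    "idx-0001.bin",
    {"a.parquet": b"bloom-a", "b.parquet": b"bloom-bb"},
    buf,
)
print(read_index(entries[1], buf))
```

A reader that only needs the index for b.parquet seeks straight to its offset and never touches the other blob, which is the same access pattern deletion vectors rely on.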


Thanks,
Guy

On 2025/12/05 10:26:03 Péter Váry wrote:
> Hi Huaxin,
>
> Great to see your interest in indexing!
>
> From my perspective, there are two distinct levels of indexes:
>
>    1. *File-level indexes* – These help determine whether a file needs to
>    be read and, if so, which portions to scan. Such indexes can be embedded
>    within the data file or stored alongside it, and they can be computed
>    asynchronously and independently.
>    2. *Table-level indexes* – These involve maintaining an auxiliary layout
>    for the entire table, which can deliver significant performance
>    improvements for query execution.
>
> If I understand correctly, the earlier proposal and Anirban Goswami's
> document primarily focus on the first approach. In parallel, we are working
> on a proposal that elaborates on the second. Both strategies can complement
> each other and accelerate query performance.
> Let’s collaborate to make Iceberg queries faster across all engines!
>
> Thanks,
> Peter
>
>
> huaxin gao <[email protected]> wrote (Thu, Dec 4, 2025, 23:19):
>
> > Hi all,
> > Miao and I plan to resume the secondary index work. The proposal
> > <https://docs.google.com/document/d/1E1ofBQoKRnX04bWT3utgyHQGaHZoelgXosk_UNsTUuQ/edit?tab=t.0>
> > was written in 2020 and has been reviewed. We will bring it up to date.
> >
> > Thanks,
> > Huaxin
> >
> > On Mon, Nov 24, 2025 at 7:23 AM Anirban Goswami <[email protected]>
> > wrote:
> >
> >> Peter,
> >>
> >> That is definitely one of the things dragging me away from the database
> >> approach. It is one of several thoughts.
> >>
> >> How we can prune more files is my main driver. Let’s discuss more.
> >>
> >> Ani
> >>
> >> On 24 Nov 2025, at 7:25 PM, Péter Váry <[email protected]>
> >> wrote:
> >>
> >> Hi Anirban,
> >>
> >> I don't really like the dependency on an external database for the
> >> index. Every reader would need to be able to access the database, and
> >> given a big table with several readers, it could become a bottleneck.
> >>
> >> I can imagine something similar as part of a REST catalog where the
> >> catalog is used for planning:
> >> - The Catalog could decide to read and cache the metadata from the files
> >> (the cache could be stored in a db, or rocksdb, or whatever)
> >> - During planning, the Catalog could get the relevant row groups and
> >> combine them back into a smaller number of splits (if there are
> >> contiguous row groups, they could be combined)
> >> - The users don't need to do anything else, just call the Catalog
> >> planning API.
> >>
> >> In this way, we don't have to change the metadata to get the same gains.
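
The row-group combining step described above can be sketched as follows. This is only an illustration under stated assumptions: row groups are identified by their ordinal within a file, "contiguous" means consecutive ordinals, and combine_row_groups is a hypothetical helper, not part of any Catalog planning API.

```python
def combine_row_groups(row_groups):
    """Merge runs of consecutive row-group ordinals into (first, last) splits."""
    splits = []
    for rg in sorted(set(row_groups)):
        if splits and rg == splits[-1][1] + 1:
            # Extends the current run, so widen the last split.
            splits[-1] = (splits[-1][0], rg)
        else:
            # Gap before this row group, so start a new split.
            splits.append((rg, rg))
    return splits


# Row groups selected for one file, in arbitrary order.
print(combine_row_groups([7, 2, 3, 4, 9, 8]))  # [(2, 4), (7, 9)]
```

Six matching row groups collapse into two splits here, so the reader issues two range reads instead of six.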
> >>
> >> WDYT?
> >>
> >> Anirban Goswami <[email protected]> wrote (Tue, Nov 18, 2025, 19:48):
> >>
> >>> Thanks Peter.
> >>>
> >>> I was also doing some analysis on how to get secondary indexes in
> >>> Iceberg, as we are dealing with several use cases where the table is
> >>> pretty big and the partitions are on different keys. When we query by
> >>> other keys, it is always difficult to get good response times, or
> >>> response times similar to what Snowflake or similar systems provide
> >>> through acceleration or search optimisation methods.
> >>>
> >>> We already have a huge metadata load, and if we add the index as files
> >>> on the file system, it will be too much to process and maintain as well.
> >>> I have created a doc with some thoughts and would like to understand how
> >>> you look at it.
> >>>
> >>> OLTP Database-Backed Index Architecture for Apache Iceberg
> >>> <https://docs.google.com/document/d/15230FAEF3_8EEEniZ2c-S6I46dECDWAjDoInpNNHdiQ/edit?sharingaction=ownershiptransfer&pli=1&tab=t.0#heading=h.zcwgk1s56yiy>
> >>>
> >>> Regards,
> >>> Ani
> >>>
> >>>
> >>> On 2025/11/18 11:32:24 Péter Váry wrote:
> >>> > Hi Team,
> >>> >
> >>> > Do we have any progress on this topic? I’d really like to see this move
> >>> > forward.
> >>> >
> >>> > Following Sreeram’s suggestion, we should start collecting the key use
> >>> > cases we want to support with indexes. Here’s what I’ve heard so far:
> >>> >
> >>> >    - *Primary key index*
> >>> >       - Find a single or few rows by a given primary key
> >>> >       - Build the Flink “primary key → file_name, position” state by
> >>> >       bulk reading the primary key index
> >>> >    - *Secondary index*
> >>> >       - Range or min/max filtering on columns that are not part of the
> >>> >       primary key (primary sort order)
> >>> >    - *Full-text index*
> >>> >       - Term search in text columns
> >>> >    - *Vector index*
> >>> >       - Nearest or approximate nearest neighbor search
> >>> >    - *Geospatial index*
> >>> >       - Finding points within a polygon or nearest location
> >>> >
> >>> > We should identify a few critical use cases and keep the others in mind
> >>> > when designing how we store, retrieve, and use these indexes.
> >>> > Personally, I’d love to see *vector indexes in Iceberg*, enabling fast
> >>> > AI searches on Iceberg tables.
> >>> >
> >>> > For reference, I asked Copilot to collect the currently available index
> >>> > types in MSSQL, Oracle, Postgres, MySQL, and LanceDB. Here’s the list:
> >>> > https://docs.google.com/spreadsheets/d/14cBdwsOw89ivolHtAw342YNoGmb1-Kri1E80hwWymL0
> >>> >
> >>> > Thanks,
> >>> > Peter
> >>> >
> >>> >
> >>> > Aihua Xu <[email protected]> wrote (Sun, Nov 2, 2025, 4:11):
> >>> >
> >>> > > Thanks Steven for raising this topic and giving a summary on the
> >>> > > proposals. I would like to get involved in this area.
> >>> > >
> >>> > > On Fri, Oct 31, 2025 at 4:49 PM huaxin gao <[email protected]> wrote:
> >>> > >
> >>> > >> Thanks, Steven, for taking the initiative. I have previously
> >>> > >> collaborated with Miao from Adobe on secondary index and would like
> >>> > >> to continue that work.
> >>> > >>
> >>> > >> Huaxin
> >>> > >>
> >>> > >> On Fri, Oct 31, 2025 at 1:07 PM Xinli shang <[email protected]> wrote:
> >>> > >>
> >>> > >>> Thanks Steven for proposing this! This is the right direction to go.
> >>> > >>> We definitely see challenges in some cases without indexing support,
> >>> > >>> especially around equality deletes and point lookups. I would like
> >>> > >>> to contribute as well. One thing we need to be careful about is the
> >>> > >>> overhead of the index itself, such as memory usage, index updates, etc.
> >>> > >>>
> >>> > >>> Namratha, for Parquet column index, we had one for Presto
> >>> > >>> https://www.youtube.com/watch?v=fr_HdhMEa3s.
> >>> > >>>
> >>> > >>>
> >>> > >>>
> >>> > >>>
> >>> > >>> On Fri, Oct 31, 2025 at 11:48 AM namratha mk <[email protected]> wrote:
> >>> > >>>
> >>> > >>>> Hi,
> >>> > >>>>
> >>> > >>>> I see this point in the doc:
> >>> > >>>>
> >>> > >>>> *The primary key index can also be useful for point lookup.*
> >>> > >>>> But to achieve this, we would need to store native file format
> >>> > >>>> metadata, like the Parquet page index
> >>> > >>>> <https://parquet.apache.org/docs/file-format/pageindex/>, in the
> >>> > >>>> primary index, which helps in fetching rows for the lookup use case.
> >>> > >>>> Have there been any talks in the community about this? I would like
> >>> > >>>> to get more opinions on this.
> >>> > >>>>
> >>> > >>>> Thanks,
> >>> > >>>> Namratha
> >>> > >>>>
> >>> > >>>> On Sat, Jul 19, 2025 at 2:39 AM Manish Malhotra <
> >>> > >>>> [email protected]> wrote:
> >>> > >>>>
> >>> > >>>>> Thanks Steven,
> >>> > >>>>> +1 on this initiative, I am also interested in contributing in this
> >>> > >>>>> area.
> >>> > >>>>> As you mentioned, it has quite a breadth. My thought is that we can
> >>> > >>>>> start a document to discuss the different layers separately, like
> >>> > >>>>> types of indexes, sync vs. async, spec changes, and the priority of
> >>> > >>>>> the indexes to be supported (instead of targeting all in one go).
> >>> > >>>>>
> >>> > >>>>> Thanks,
> >>> > >>>>> Manish
> >>> > >>>>>
> >>> > >>>>> On Fri, Jul 18, 2025 at 10:41 PM Steven Wu <[email protected]>
> >>> > >>>>> wrote:
> >>> > >>>>>
> >>> > >>>>>> Vignesh, that is yet to be discussed. We haven't got to that kind
> >>> > >>>>>> of detail yet.
> >>> > >>>>>>
> >>> > >>>>>> In some cases, the index files are expected to be added along with
> >>> > >>>>>> the data files in the same commit. Maybe some cases (like secondary
> >>> > >>>>>> index) would prefer an async process.
> >>> > >>>>>>
> >>> > >>>>>> On Fri, Jul 18, 2025 at 4:11 PM Vignesh <[email protected]>
> >>> > >>>>>> wrote:
> >>> > >>>>>>
> >>> > >>>>>>> Are the index files for all kinds expected to be written and added
> >>> > >>>>>>> along with data files or would it be an optional async step?
> >>> > >>>>>>>
> >>> > >>>>>>> On Fri, Jul 18, 2025, 5:09 AM Péter Váry <
> >>> > >>>>>>> [email protected]> wrote:
> >>> > >>>>>>>
> >>> > >>>>>>>> > *Primary Index*: Conventionally Primary Index - just means what
> >>> > >>>>>>>> > the Table's Primary storage layout/organization was. Given that
> >>> > >>>>>>>> > Iceberg supports Sort-order - if the Spec adds constraints to
> >>> > >>>>>>>> > derive/influence Sort order based on the Identifier columns - it
> >>> > >>>>>>>> > satisfies the Primary Index criteria.
> >>> > >>>>>>>>
> >>> > >>>>>>>> Here is my mental model:
> >>> > >>>>>>>> - Primary Key - the unique identifier for the rows
> >>> > >>>>>>>> - Primary Key index - database index constructed on the Primary
> >>> > >>>>>>>> Key column
> >>> > >>>>>>>> - Iceberg sort order - performance optimization used to speed up
> >>> > >>>>>>>> frequent or costly queries.
> >>> > >>>>>>>>
> >>> > >>>>>>>> The Iceberg sort order is often defined on different columns than
> >>> > >>>>>>>> the Primary Key, so I would try to avoid mixing the two concepts.
> >>> > >>>>>>>>
> >>> > >>>>>>>> > we found that an Iceberg Table based Store Secondary Index -
> >>> > >>>>>>>> > provides the right balance between the ability to skip over and
> >>> > >>>>>>>> > load needed sections and yet provide the right performance
> >>> > >>>>>>>> > benefits.
> >>> > >>>>>>>>
> >>> > >>>>>>>> Could you please elaborate on what "Iceberg Table based Store
> >>> > >>>>>>>> Secondary Index" means?
> >>> > >>>>>>>> Is this another Iceberg table with different columns and a
> >>> > >>>>>>>> different sort order?
> >>> > >>>>>>>>
> >>> > >>>>>>>> > they want it to be in an open format, so that it can be shared
> >>> > >>>>>>>> > with other engines!
> >>> > >>>>>>>>
> >>> > >>>>>>>> Wholeheartedly agreed!
> >>> > >>>>>>>>
> >>> > >>>>>>>> Thanks Steven for starting, and others for participating in the
> >>> > >>>>>>>> discussion!
> >>> > >>>>>>>> Peter
> >>> > >>>>>>>>
> >>> > >>>>>>>> Sreeram Garlapati <[email protected]> wrote (Tue, Jul 15, 2025, 22:12):
> >>> > >>>>>>>>
> >>> > >>>>>>>>> Thanks Steven for starting this.
> >>> > >>>>>>>>>
> >>> > >>>>>>>>> I am interested in the indexing-related conversations.
> >>> > >>>>>>>>>
> >>> > >>>>>>>>> Here are some preliminary thoughts:
> >>> > >>>>>>>>>
> >>> > >>>>>>>>>    1. *Primary Index*: Conventionally Primary Index - just means
> >>> > >>>>>>>>>    what the Table's Primary storage layout/organization was. Given
> >>> > >>>>>>>>>    that Iceberg supports Sort-order - if the Spec adds constraints
> >>> > >>>>>>>>>    to derive/influence Sort order based on the Identifier columns -
> >>> > >>>>>>>>>    it satisfies the Primary Index criteria.
> >>> > >>>>>>>>>    2. *Secondary Index*: Secondary Index storage calls for an
> >>> > >>>>>>>>>    efficient organization which can hold Secondary Keys along with
> >>> > >>>>>>>>>    the Location of the Row and any included columns. The index can
> >>> > >>>>>>>>>    be of many types, based on the Data. Iceberg tables are
> >>> > >>>>>>>>>    typically very, very large. Hence, these Indexes also tend to be
> >>> > >>>>>>>>>    very large. Based on our past 1-2 years of work in this space,
> >>> > >>>>>>>>>    we found that an Iceberg Table based Store Secondary Index -
> >>> > >>>>>>>>>    provides the right balance between the ability to skip over and
> >>> > >>>>>>>>>    load needed sections and yet provide the right performance
> >>> > >>>>>>>>>    benefits. This decision was also shaped by popular opinion from
> >>> > >>>>>>>>>    many of our partners & customers - as building the Index
> >>> > >>>>>>>>>    involves a lot of computation, they want it to be in an open
> >>> > >>>>>>>>>    format, so that it can be shared with other engines!
> >>> > >>>>>>>>>    3. *Others: Full Text Search Indexes and Vector Indexes*: It is
> >>> > >>>>>>>>>    critical that we allow years of innovation in the space of Full
> >>> > >>>>>>>>>    Text Search and Vector indexes, especially with the current
> >>> > >>>>>>>>>    acceleration in AI adoption & the need it is driving in the
> >>> > >>>>>>>>>    Keyword and Similarity Search space. Given that Iceberg tables
> >>> > >>>>>>>>>    are extremely large, it is critical for us to provide a good
> >>> > >>>>>>>>>    story for Indexes that can be incrementally updated / partially
> >>> > >>>>>>>>>    loaded into memory.
> >>> > >>>>>>>>>
> >>> > >>>>>>>>>
> >>> > >>>>>>>>> Looking forward to the discussions.
> >>> > >>>>>>>>>
> >>> > >>>>>>>>> Best,
> >>> > >>>>>>>>> Sreeram
> >>> > >>>>>>>>>
> >>> > >>>>>>>>> On Tue, Jul 15, 2025 at 9:33 AM Anurag Mantripragada
> >>> > >>>>>>>>> <[email protected]> wrote:
> >>> > >>>>>>>>>
> >>> > >>>>>>>>>> Thanks for starting this thread, Steven!
> >>> > >>>>>>>>>>
> >>> > >>>>>>>>>> I have been interested in secondary indexing in Iceberg. There
> >>> > >>>>>>>>>> was an old proposal for secondary indexing [1]; we may need to
> >>> > >>>>>>>>>> revisit/redesign these structures. I agree this is a very broad
> >>> > >>>>>>>>>> topic, and having indexing structures general enough to support
> >>> > >>>>>>>>>> a wide range of use cases will be a key challenge.
> >>> > >>>>>>>>>>
> >>> > >>>>>>>>>> I would like to get involved in any discussions related to
> >>> > >>>>>>>>>> indexing.
> >>> > >>>>>>>>>>
> >>> > >>>>>>>>>> [1] -
> >>> > >>>>>>>>>> https://docs.google.com/document/d/1E1ofBQoKRnX04bWT3utgyHQGaHZoelgXosk_UNsTUuQ/edit?tab=t.0
> >>> > >>>>>>>>>>
> >>> > >>>>>>>>>>
> >>> > >>>>>>>>>> Thanks,
> >>> > >>>>>>>>>> Anurag Mantripragada
> >>> > >>>>>>>>>>
> >>> > >>>>>>>>>>
> >>> > >>>>>>>>>> On Jul 15, 2025, at 2:37 AM, Maximilian Michels <[email protected]> wrote:
> >>> > >>>>>>>>>>
> >>> > >>>>>>>>>> Thanks Steven for the summary. It would be great to extend the
> >>> > >>>>>>>>>> Iceberg spec with index files, such that they can be used for
> >>> > >>>>>>>>>> the different use cases.
> >>> > >>>>>>>>>>
> >>> > >>>>>>>>>> For my understanding, let me further outline the different types
> >>> > >>>>>>>>>> of use cases for index files:
> >>> > >>>>>>>>>>
> >>> > >>>>>>>>>> ---
> >>> > >>>>>>>>>> Topic 1: Accelerating the resolution of equality deletes
> >>> > >>>>>>>>>> ---
> >>> > >>>>>>>>>>
> >>> > >>>>>>>>>> In its current form, equality deletes make it impossible to
> >>> > >>>>>>>>>> achieve proper merge-on-read performance in streaming reads, and
> >>> > >>>>>>>>>> they also add a significant performance overhead in batch
> >>> > >>>>>>>>>> pipelines.
> >>> > >>>>>>>>>>
> >>> > >>>>>>>>>> Approach (a):
> >>> > >>>>>>>>>> https://docs.google.com/document/d/1Jz4Fjt-6jRmwqbgHX_u0ohuyTB9ytDzfslS7lYraIjk/
> >>> > >>>>>>>>>> Converti
[message truncated...]
