Hi Huaxin, Peter,

I will be happy to participate in these discussions as well. I think there is potential to leverage some of the work done to support deletion vectors and use it for storing indexes. Specifically, the manifest has the *referenced_data_file* field, which is currently used by deletion vectors but could also be used to reference an index file. Similarly, content_offset and content_size_in_bytes could be used to pack multiple indexes together in one file. Maybe it would make sense to introduce an index manifest to track the indexes individually.
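
To make the packing idea concrete, here is a minimal sketch under stated assumptions. Only the field names referenced_data_file, content_offset and content_size_in_bytes come from the existing manifest fields mentioned above; the `IndexEntry` class, its `index_file` field, the `pack_indexes` and `read_index` helpers, and the file names are hypothetical and not part of the Iceberg spec or any Iceberg API.

```python
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    """Hypothetical index-manifest entry (an assumption, not spec)."""
    referenced_data_file: str   # data file that this index blob covers
    index_file: str             # file that holds the packed index blobs
    content_offset: int         # byte offset of the blob inside index_file
    content_size_in_bytes: int  # length of the blob in bytes


def pack_indexes(index_file, blobs, out):
    """Append each blob to `out` and record where it landed."""
    entries = []
    for data_file, blob in blobs.items():
        offset = out.tell()
        out.write(blob)
        entries.append(IndexEntry(data_file, index_file, offset, len(blob)))
    return entries


def read_index(entry, src):
    """Read exactly one blob back using its recorded offset and size."""
    src.seek(entry.content_offset)
    return src.read(entry.content_size_in_bytes)


# An in-memory buffer stands in for the packed index file.
buf = io.BytesIO()
entries = pack_indexes(
    "index-00001.bin",
    {"data-a.parquet": b"bloom-a", "data-b.parquet": b"bloom-bb"},
    buf,
)
for e in entries:
    print(e.referenced_data_file, e.content_offset, e.content_size_in_bytes)
```

A reader that needs the index for one data file only has to fetch the byte range given by its entry, which is the same access pattern deletion vectors already rely on.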

Thanks,
Guy

On 2025/12/05 10:26:03 Péter Váry wrote:
> Hi Huaxin,
>
> Great to see your interest in indexing!
>
> From my perspective, there are two distinct levels of indexes:
>
>    1. *File-level indexes* – These help determine whether a file needs to be read and, if so, which portions to scan. Such indexes can be embedded within the data file or stored alongside it, and they can be computed asynchronously and independently.
>    2. *Table-level indexes* – These involve maintaining an auxiliary layout for the entire table, which can deliver significant performance improvements for query execution.
>
> If I understand correctly, the earlier proposal and Anirban Goswami's document primarily focus on the first approach. In parallel, we are working on a proposal that elaborates on the second. Both strategies can complement each other and accelerate query performance.
> Let’s collaborate to make Iceberg queries faster across all engines!
>
> Thanks,
> Peter
>
> huaxin gao <[email protected]> wrote (on Thu, Dec 4, 2025, 23:19):
>
> > Hi all,
> > Miao and I plan to resume the secondary index work. The proposal
> > <https://docs.google.com/document/d/1E1ofBQoKRnX04bWT3utgyHQGaHZoelgXosk_UNsTUuQ/edit?tab=t.0>
> > was written in 2020 and has been reviewed. We will bring it up to date.
> >
> > Thanks,
> > Huaxin
> >
> > On Mon, Nov 24, 2025 at 7:23 AM Anirban Goswami <[email protected]> wrote:
> >
> >> Peter,
> >>
> >> That is definitely something that is dragging me away from the database approach. It is one of several thoughts.
> >>
> >> How we can get rid of more files is my driver. Let’s discuss more.
> >>
> >> Ani
> >>
> >> On 24 Nov 2025, at 7:25 PM, Péter Váry <[email protected]> wrote:
> >>
> >> Hi Anirban,
> >>
> >> I don't really like the dependency on the external database for the index. Every reader would need to be able to access the database, and given a big table with several readers, it could become a bottleneck.
> >>
> >> I can imagine something similar as part of a REST catalog where the catalog is used for planning:
> >> - The Catalog could decide to read and cache the metadata from the files (the cache could be stored in a db, or rocksdb, or whatever)
> >> - During planning, the Catalog could get the relevant rowgroups and combine them back into a smaller number of splits (if there are contiguous rowgroups, they could be combined)
> >> - The users don't need to do anything else, just call the Catalog planning API.
> >>
> >> In this way, we don't have to change the metadata to get the same gains.
> >>
> >> WDYT?
> >>
> >> Anirban Goswami <[email protected]> wrote (on Tue, Nov 18, 2025, 19:48):
> >>
> >>> Thanks Peter.
> >>>
> >>> I was also doing some analysis on how to get secondary indexes in Iceberg, as we are dealing with several use cases where the table is pretty big and the partitions are on different keys. When we query by other keys, it is always difficult to get good response times, or response times similar to what Snowflake or similar systems provide through acceleration or search optimization methods.
> >>>
> >>> We already have a huge metadata load, and if we add indexes as files on the file system, it will be too much to process and maintain as well. I have created a doc with some thoughts and want to understand how you look at it.
> >>>
> >>> OLTP Database-Backed Index Architecture for Apache Iceberg
> >>> <https://docs.google.com/document/d/15230FAEF3_8EEEniZ2c-S6I46dECDWAjDoInpNNHdiQ/edit?sharingaction=ownershiptransfer&pli=1&tab=t.0#heading=h.zcwgk1s56yiy>
> >>>
> >>> Regards,
> >>> Ani
> >>>
> >>> On 2025/11/18 11:32:24 Péter Váry wrote:
> >>> > Hi Team,
> >>> >
> >>> > Do we have any progress on this topic? I’d really like to see this move forward.
> >>> >
> >>> > Following Sreeram’s suggestion, we should start collecting the key use cases we want to support with indexes. Here’s what I’ve heard so far:
> >>> >
> >>> >    - *Primary key index*
> >>> >       - Find a single or few rows by a given primary key
> >>> >       - Build the Flink “primary key → file_name, position” state by bulk reading the primary key index
> >>> >    - *Secondary index*
> >>> >       - Range or min/max filtering on columns that are not part of the primary key (primary sort order)
> >>> >    - *Full-text index*
> >>> >       - Term search in text columns
> >>> >    - *Vector index*
> >>> >       - Nearest or approximate nearest neighbor search
> >>> >    - *Geospatial index*
> >>> >       - Finding points within a polygon or nearest location
> >>> >
> >>> > We should identify a few critical use cases and keep the others in mind when designing how we store, retrieve, and use these indexes. Personally, I’d love to see *vector indexes in Iceberg*, enabling fast AI searches on Iceberg tables.
> >>> >
> >>> > For reference, I asked Copilot to collect the currently available index types in MSSQL, Oracle, Postgres, MySQL, and LanceDB. Here’s the list:
> >>> > https://docs.google.com/spreadsheets/d/14cBdwsOw89ivolHtAw342YNoGmb1-Kri1E80hwWymL0
> >>> >
> >>> > Thanks,
> >>> > Peter
> >>> >
> >>> > Aihua Xu <[email protected]> wrote (on Sun, Nov 2, 2025, 4:11):
> >>> >
> >>> > > Thanks Steven for raising this topic and giving a summary on the proposals. I would like to get involved in this area.
> >>> > >
> >>> > > On Fri, Oct 31, 2025 at 4:49 PM huaxin gao <[email protected]> wrote:
> >>> > >
> >>> > >> Thanks, Steven, for taking the initiative. I have previously collaborated with Miao from Adobe on secondary index and would like to continue that work.
> >>> > >>
> >>> > >> Huaxin
> >>> > >>
> >>> > >> On Fri, Oct 31, 2025 at 1:07 PM Xinli shang <[email protected]> wrote:
> >>> > >>
> >>> > >>> Thanks Steven for proposing this! This is the right direction to go. We definitely see challenges in some cases without indexing support, especially around equality deletes and point lookups. I would like to contribute as well. One thing we need to be careful about is the overhead of the index itself, such as memory usage and index updates.
> >>> > >>>
> >>> > >>> Namratha, for the Parquet column index, we had one for Presto: https://www.youtube.com/watch?v=fr_HdhMEa3s.
> >>> > >>>
> >>> > >>> On Fri, Oct 31, 2025 at 11:48 AM namratha mk <[email protected]> wrote:
> >>> > >>>
> >>> > >>>> Hi,
> >>> > >>>>
> >>> > >>>> I see this point in the doc:
> >>> > >>>>
> >>> > >>>> *The primary key index can also be useful for point lookup.*
> >>> > >>>> But to achieve the above, we would need to store native file format metadata like the Parquet page index <https://parquet.apache.org/docs/file-format/pageindex/> in the primary index, which helps in fetching for the lookup use case. Have there been any talks in the community about this? I would like to get more opinions on this.
> >>> > >>>>
> >>> > >>>> Thanks,
> >>> > >>>> Namratha
> >>> > >>>>
> >>> > >>>> On Sat, Jul 19, 2025 at 2:39 AM Manish Malhotra <[email protected]> wrote:
> >>> > >>>>
> >>> > >>>>> Thanks Steven,
> >>> > >>>>> +1 on this initiative, I am also interested in contributing in this area.
> >>> > >>>>> As you mentioned, it has quite a breadth. My thought is we can start a document to discuss different layers separately, like type of indexes, sync vs async, spec changes, and priority of the indexes to be supported (instead of targeting all in one go).
> >>> > >>>>>
> >>> > >>>>> Thanks,
> >>> > >>>>> Manish
> >>> > >>>>>
> >>> > >>>>> On Fri, Jul 18, 2025 at 10:41 PM Steven Wu <[email protected]> wrote:
> >>> > >>>>>
> >>> > >>>>>> Vignesh, that is yet to be discussed. We haven't got to that kind of detail yet.
> >>> > >>>>>>
> >>> > >>>>>> In some cases, the index files are expected to be added along with the data files in the same commit. Maybe some cases (like secondary index) would prefer an async process.
> >>> > >>>>>>
> >>> > >>>>>> On Fri, Jul 18, 2025 at 4:11 PM Vignesh <[email protected]> wrote:
> >>> > >>>>>>
> >>> > >>>>>>> Are the index files for all kinds expected to be written and added along with data files or would it be an optional async step?
> >>> > >>>>>>>
> >>> > >>>>>>> On Fri, Jul 18, 2025, 5:09 AM Péter Váry <[email protected]> wrote:
> >>> > >>>>>>>
> >>> > >>>>>>>> > *Primary Index*: Conventionally Primary Index - just means what the Table's Primary storage layout/organization was. Given that Iceberg supports Sort-order - if the Spec adds constraints to derive/influence Sort order based on the Identifier columns - it satisfies the Primary Index criteria.
> >>> > >>>>>>>>
> >>> > >>>>>>>> Here is my mental model:
> >>> > >>>>>>>> - Primary Key - the unique identifier for the rows
> >>> > >>>>>>>> - Primary Key index - database index constructed on the Primary Key column
> >>> > >>>>>>>> - Iceberg sort order - performance optimization used to speed up frequent, or costly queries.
> >>> > >>>>>>>>
> >>> > >>>>>>>> The Iceberg sort order is often defined on different columns than the Primary Key, so I would try to avoid mixing the two concepts.
> >>> > >>>>>>>>
> >>> > >>>>>>>> > we found that an Iceberg Table based Store Secondary Index - provides the right balance between the ability to skip over and load needed sections and yet provide the right performance benefits.
> >>> > >>>>>>>>
> >>> > >>>>>>>> Could you please elaborate on what "Iceberg Table based Store Secondary Index" means?
> >>> > >>>>>>>> Is this another Iceberg table with different columns and different sort order?
> >>> > >>>>>>>>
> >>> > >>>>>>>> > they want it to be in an open format, so that it can be shared with other engines!
> >>> > >>>>>>>>
> >>> > >>>>>>>> Wholeheartedly agreed!
> >>> > >>>>>>>>
> >>> > >>>>>>>> Thanks Steven for starting, and others for participating in the discussion!
> >>> > >>>>>>>> Peter
> >>> > >>>>>>>>
> >>> > >>>>>>>> Sreeram Garlapati <[email protected]> wrote (on Tue, Jul 15, 2025, 22:12):
> >>> > >>>>>>>>
> >>> > >>>>>>>>> Thanks Steven for starting this.
> >>> > >>>>>>>>>
> >>> > >>>>>>>>> I am interested in the indexing-related conversations.
> >>> > >>>>>>>>>
> >>> > >>>>>>>>> Here are some preliminary thoughts:
> >>> > >>>>>>>>>
> >>> > >>>>>>>>>    1. *Primary Index*: Conventionally Primary Index - just means what the Table's Primary storage layout/organization was. Given that Iceberg supports Sort-order - if the Spec adds constraints to derive/influence Sort order based on the Identifier columns - it satisfies the Primary Index criteria.
> >>> > >>>>>>>>>    2. *Secondary Index*: Secondary Index storage calls for an efficient organization which can hold Secondary Keys along with the Location of the Row and any included columns. The index can be of many types, based on the Data. Iceberg tables are typically very, very large. Hence, these Indexes also tend to be very large. Based on our past 1-2 years of work in this space, we found that an Iceberg Table based Store Secondary Index - provides the right balance between the ability to skip over and load needed sections and yet provide the right performance benefits. This decision was also shaped by popular opinion from many of our partners & customers - as building the Index involves a lot of computation, they want it to be in an open format, so that it can be shared with other engines!
> >>> > >>>>>>>>>    3. *Others: Full Text Search Indexes and Vector Indexes*: It is critical that we allow years of innovation in the space of Full Text Search and Vector indexes, especially with the current acceleration in AI adoption & the need it is driving in the Keyword and Similarity Search space. Given that Iceberg tables are extremely large, it is critical for us to provide a good story for Indexes that can be incrementally updated / partially loaded into memory.
> >>> > >>>>>>>>>
> >>> > >>>>>>>>> Looking forward to the discussions.
> >>> > >>>>>>>>>
> >>> > >>>>>>>>> Best,
> >>> > >>>>>>>>> Sreeram
> >>> > >>>>>>>>>
> >>> > >>>>>>>>> On Tue, Jul 15, 2025 at 9:33 AM Anurag Mantripragada <[email protected]> wrote:
> >>> > >>>>>>>>>
> >>> > >>>>>>>>>> Thanks for starting this thread, Steven!
> >>> > >>>>>>>>>>
> >>> > >>>>>>>>>> I have been interested in secondary indexing in Iceberg. There was an old proposal for secondary indexing [1]; we may need to revisit/redesign these structures. I agree this is a very broad topic and having indexing structures general enough to support a wide range of use-cases will be a key challenge.
> >>> > >>>>>>>>>>
> >>> > >>>>>>>>>> I would like to get involved in any discussions related to indexing.
> >>> > >>>>>>>>>>
> >>> > >>>>>>>>>> [1] - https://docs.google.com/document/d/1E1ofBQoKRnX04bWT3utgyHQGaHZoelgXosk_UNsTUuQ/edit?tab=t.0
> >>> > >>>>>>>>>>
> >>> > >>>>>>>>>> Thanks,
> >>> > >>>>>>>>>> Anurag Mantripragada
> >>> > >>>>>>>>>>
> >>> > >>>>>>>>>> On Jul 15, 2025, at 2:37 AM, Maximilian Michels <[email protected]> wrote:
> >>> > >>>>>>>>>>
> >>> > >>>>>>>>>> Thanks Steven for the summary. It would be great to extend the Iceberg spec with index files, such that they can be used for the different use cases.
> >>> > >>>>>>>>>>
> >>> > >>>>>>>>>> For my understanding, let me further outline the different types of use cases for index files:
> >>> > >>>>>>>>>>
> >>> > >>>>>>>>>> ---
> >>> > >>>>>>>>>> Topic 1: Accelerating the resolution of equality deletes
> >>> > >>>>>>>>>> ---
> >>> > >>>>>>>>>>
> >>> > >>>>>>>>>> In their current form, equality deletes make it impossible to achieve proper merge-on-read performance in streaming reads, and they also add a significant performance overhead in batch pipelines.
> >>> > >>>>>>>>>>
> >>> > >>>>>>>>>> Approach (a):
> >>> > >>>>>>>>>> https://docs.google.com/document/d/1Jz4Fjt-6jRmwqbgHX_u0ohuyTB9ytDzfslS7lYraIjk/
> >>> > >>>>>>>>>> Converti [message truncated...]
