During the PR review [1], we began exploring what we could use as an intermediate layer to reduce the need for engines and file formats to implement the full matrix of file format to object model conversions.
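To illustrate the shape of such an intermediate layer, here is a minimal sketch with hypothetical names (`RegistrySketch`, `register`, `readBuilderFor` are illustrative, not the API from the PR): a shared registry keyed by the (file format, object model) pair means each format contributes one entry per supported model, instead of every engine hand-wiring a reader for every format.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch: the names below are illustrative only, not the
// actual Iceberg API proposed in PR #12774.
public class RegistrySketch {
  // key = "format/model", value = factory producing a reader builder
  private static final Map<String, Supplier<Object>> REGISTRY = new HashMap<>();

  static {
    // Each file format registers one factory per object model it supports.
    register("parquet", "spark", () -> "parquet-spark-read-builder");
    register("avro", "flink", () -> "avro-flink-read-builder");
  }

  static void register(String format, String model, Supplier<Object> factory) {
    REGISTRY.put(format + "/" + model, factory);
  }

  static Object readBuilderFor(String format, String model) {
    Supplier<Object> factory = REGISTRY.get(format + "/" + model);
    if (factory == null) {
      throw new IllegalArgumentException(
          "No reader registered for " + format + "/" + model);
    }
    return factory.get();
  }

  public static void main(String[] args) {
    // An engine asks the registry instead of knowing each format directly.
    System.out.println(readBuilderFor("parquet", "spark"));
  }
}
```

With K formats and N object models, direct conversions grow on the order of K×N, while a registry like this needs only one registration per actually supported pair, and unsupported pairs fail fast with a clear error.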
To support this discussion, I’ve created and run a set of performance benchmarks and compiled a document outlining the potential benefits and trade-offs [2].
Feedback is welcome; feel free to comment on the document, the PR, or directly in this thread.

Thanks,
Peter

[1] - PR discussion - https://github.com/apache/iceberg/pull/12774#discussion_r2093626096
[2] - File Format and engine object model transformation performance - https://docs.google.com/document/d/1GdA8IowKMtS3QVdm8s-0X-ZRYetcHv2bhQ9mrSd3fd4

Péter Váry <peter.vary.apa...@gmail.com> wrote (on Wed, May 7, 2025, 13:15):
> Hi everyone,
> The proposed API part is reviewed and ready to go. See:
> https://github.com/apache/iceberg/pull/12774
> Thanks to everyone who reviewed it already!
>
> Many of you wanted to review, but I know that time constraints apply to
> everyone. I would still very much like to hear your voices, so I will not
> merge the PR this week. Please review it if you can.
>
> Thanks,
> Peter
>
> Péter Váry <peter.vary.apa...@gmail.com> wrote (on Wed, Apr 16, 2025,
> 7:02):
>
>> Hi Renjie,
>> The first one for the proposed new API is here:
>> https://github.com/apache/iceberg/pull/12774
>> Thanks, Peter
>>
>> On Wed, Apr 16, 2025, 05:40 Renjie Liu <liurenjie2...@gmail.com> wrote:
>>
>>> Hi, Peter:
>>>
>>> Thanks for the effort. I totally agree with splitting them into smaller
>>> PRs to move forward.
>>>
>>> I'm quite interested in this topic; please ping me on those split PRs
>>> and I'll help review.
>>>
>>> On Mon, Apr 14, 2025 at 11:22 PM Jean-Baptiste Onofré <j...@nanthrax.net>
>>> wrote:
>>>
>>>> Hi Peter
>>>>
>>>> Awesome! Thank you so much!
>>>> I will do a new pass.
>>>>
>>>> Regards
>>>> JB
>>>>
>>>> On Fri, Apr 11, 2025 at 3:48 PM Péter Váry <peter.vary.apa...@gmail.com>
>>>> wrote:
>>>> >
>>>> > Hi JB,
>>>> >
>>>> > Separated out the proposed interfaces into a new PR:
>>>> https://github.com/apache/iceberg/pull/12774.
>>>> > Reviewers can check that out if they are only interested in what the
>>>> new API would look like.
>>>> >
>>>> > Thanks,
>>>> > Peter
>>>> >
>>>> > Jean-Baptiste Onofré <j...@nanthrax.net> wrote (on Thu, Apr 10, 2025,
>>>> 18:25):
>>>> >>
>>>> >> Hi Peter
>>>> >>
>>>> >> Thanks for the ping about the PR.
>>>> >>
>>>> >> Maybe, to facilitate the review and move forward faster, we should
>>>> >> split the PR into smaller PRs:
>>>> >> - one with the interfaces (ReadBuilder, AppenderBuilder, ObjectModel,
>>>> >> AppenderBuilder, DataWriterBuilder, ...)
>>>> >> - one for each file format provider (Parquet, Avro, ORC)
>>>> >>
>>>> >> Thoughts? I can help with the split if needed.
>>>> >>
>>>> >> Regards
>>>> >> JB
>>>> >>
>>>> >> On Thu, Apr 10, 2025 at 5:16 AM Péter Váry <
>>>> peter.vary.apa...@gmail.com> wrote:
>>>> >> >
>>>> >> > Since the 1.9.0 release candidate has been created, I would like
>>>> to resurrect this PR: https://github.com/apache/iceberg/pull/12298 to
>>>> ensure that we have as long a testing period as possible for it.
>>>> >> >
>>>> >> > To recap, here is what the PR does after the review rounds:
>>>> >> >
>>>> >> > Created three interface classes which are implemented by the file
>>>> formats:
>>>> >> >
>>>> >> > ReadBuilder - Builder for reading data from data files
>>>> >> > AppenderBuilder - Builder for writing data to data files
>>>> >> > ObjectModel - Provides ReadBuilders and AppenderBuilders for a
>>>> specific data file format and object model pair
>>>> >> >
>>>> >> > Updated the Parquet, Avro, and ORC implementations for these
>>>> interfaces, and deprecated the old reader/writer APIs
>>>> >> > Created interface classes which will be used by the actual
>>>> readers/writers of the data files:
>>>> >> >
>>>> >> > AppenderBuilder - Builder for writing a file
>>>> >> > DataWriterBuilder - Builder for generating a data file
>>>> >> > PositionDeleteWriterBuilder - Builder for generating a position
>>>> delete file
>>>> >> > EqualityDeleteWriterBuilder - Builder for generating an equality
>>>> delete file
>>>> >> > No ReadBuilder here - the file format reader builder is reused
>>>> >> >
>>>> >> > Created a WriterBuilder class which implements the interfaces above
>>>> (AppenderBuilder/DataWriterBuilder/PositionDeleteWriterBuilder/EqualityDeleteWriterBuilder)
>>>> based on a provided file-format-specific AppenderBuilder
>>>> >> > Created an ObjectModelRegistry which stores the available
>>>> ObjectModels, and from which engines and users can request the readers
>>>> (ReadBuilder) and writers
>>>> (AppenderBuilder/DataWriterBuilder/PositionDeleteWriterBuilder/EqualityDeleteWriterBuilder).
>>>> >> > Created the appropriate ObjectModels:
>>>> >> >
>>>> >> > GenericObjectModels - for reading and writing Iceberg Records
>>>> >> > SparkObjectModels - for reading (vectorized and non-vectorized)
>>>> and writing Spark InternalRow/ColumnarBatch objects
>>>> >> > FlinkObjectModels - for reading and writing Flink RowData objects
>>>> >> > An Arrow object model is also registered for vectorized reads of
>>>> Parquet files into Arrow ColumnarBatch objects
>>>> >> >
>>>> >> > Updated the production code where the reading and writing happens
>>>> to use the ObjectModelRegistry and the new reader/writer interfaces to
>>>> access data files
>>>> >> > Kept the testing code intact to ensure that the new API/code does
>>>> not break anything
>>>> >> >
>>>> >> > The original change was not small, and grew substantially during
>>>> the review rounds. So if you have questions, or if I can do anything to
>>>> make the review easier, don't hesitate to ask. I am happy to do anything
>>>> to move this forward.
>>>> >> >
>>>> >> > Thanks,
>>>> >> > Peter
>>>> >> >
>>>> >> > Péter Váry <peter.vary.apa...@gmail.com> wrote (on Wed, Mar 26,
>>>> 2025, 14:54):
>>>> >> >>
>>>> >> >> Hi everyone,
>>>> >> >>
>>>> >> >> I have updated the File Format API PR (
>>>> https://github.com/apache/iceberg/pull/12298) based on the answers and
>>>> review comments.
>>>> >> >>
>>>> >> >> I would like to merge this only after the 1.9.0 release so we
>>>> have more time to find any issues and solve them before this goes into a
>>>> release for the users.
>>>> >> >>
>>>> >> >> For this I have updated the deprecation comments accordingly.
>>>> >> >> I would like to ask you to review the PR, so we can iron out any
>>>> requested changes and be ready to merge as soon as possible after the
>>>> 1.9.0 release.
>>>> >> >>
>>>> >> >> Thanks,
>>>> >> >> Peter
>>>> >> >>
>>>> >> >> Péter Váry <peter.vary.apa...@gmail.com> wrote (on Fri, Mar 21,
>>>> 2025, 14:32):
>>>> >> >>>
>>>> >> >>> Hi Renjie,
>>>> >> >>>
>>>> >> >>> > 1. File format filters
>>>> >> >>> >
>>>> >> >>> > Do the filters include filter expressions from both the user
>>>> query and the delete filter?
>>>> >> >>>
>>>> >> >>> The current discussion is about the filters from the user query.
>>>> >> >>>
>>>> >> >>> About the delete filter:
>>>> >> >>> Based on the suggestions on the PR, I have moved the delete
>>>> filter out of the main API. Created a `SupportsDeleteFilter` interface
>>>> for it which would allow pushing down the filter to the Parquet
>>>> vectorized readers in Spark, as this is the only place where we have
>>>> currently implemented this feature.
>>>> >> >>>
>>>> >> >>> Renjie Liu <liurenjie2...@gmail.com> wrote (on Fri, Mar 21, 2025,
>>>> 14:11):
>>>> >> >>>>
>>>> >> >>>> Hi, Peter:
>>>> >> >>>>
>>>> >> >>>> Thanks for the effort on this.
>>>> >> >>>>
>>>> >> >>>> 1. File format filters
>>>> >> >>>>
>>>> >> >>>> Do the filters include filter expressions from both the user
>>>> query and the delete filter?
>>>> >> >>>>
>>>> >> >>>> For filters from the user query, I agree with you that we
>>>> should keep the current behavior.
>>>> >> >>>>
>>>> >> >>>> For delete filters associated with data files, at first I
>>>> thought file format readers should not care about this. But now I
>>>> realize that maybe we need to push it to the file reader as well; this
>>>> is useful when the `IS_DELETED` metadata column is not necessary and we
>>>> could use these filters (position deletes, etc.) to further prune data.
>>>> >> >>>>
>>>> >> >>>> But anyway, I agree that we could postpone it to a follow-up PR.
>>>> >> >>>>
>>>> >> >>>> 2. Batch size configuration
>>>> >> >>>>
>>>> >> >>>> I'm leaning toward option 2.
>>>> >> >>>>
>>>> >> >>>> 3. Spark configuration
>>>> >> >>>>
>>>> >> >>>> I'm leaning towards using different configuration objects.
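The `SupportsDeleteFilter` interface discussed above could be sketched roughly as follows; the interface name comes from the thread, but its shape and the surrounding builder classes here are assumptions for illustration only.

```java
// Sketch only: the real SupportsDeleteFilter in the PR may differ; the
// builder classes and the int[]-based filter type here are made up.
public class DeleteFilterSketch {
  /** Opt-in mixin for reader builders that accept a pushed-down delete filter. */
  interface SupportsDeleteFilter<F> {
    void deleteFilter(F filter);
  }

  // Per the thread, only the Spark vectorized Parquet path currently
  // implements the pushdown, so only this builder opts in.
  static class VectorizedParquetReadBuilder implements SupportsDeleteFilter<int[]> {
    int[] deletedPositions = new int[0];

    @Override
    public void deleteFilter(int[] positions) {
      this.deletedPositions = positions;
    }
  }

  static class AvroReadBuilder {
    // no delete-filter pushdown support
  }

  public static void main(String[] args) {
    Object builder = new VectorizedParquetReadBuilder();
    // Callers feature-test instead of forcing every format to support pushdown.
    if (builder instanceof SupportsDeleteFilter) {
      @SuppressWarnings("unchecked")
      SupportsDeleteFilter<int[]> sink = (SupportsDeleteFilter<int[]>) builder;
      sink.deleteFilter(new int[] {3, 7});
    }
    System.out.println(((VectorizedParquetReadBuilder) builder).deletedPositions.length);
  }
}
```

Keeping the pushdown in a separate opt-in interface keeps the main reader API small while letting a single format/engine pair support the feature.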
>>>> >> >>>>
>>>> >> >>>>
>>>> >> >>>> On Thu, Mar 20, 2025 at 10:23 PM Péter Váry <
>>>> peter.vary.apa...@gmail.com> wrote:
>>>> >> >>>>>
>>>> >> >>>>> Hi Team,
>>>> >> >>>>> Thanks everyone for the reviews on
>>>> https://github.com/apache/iceberg/pull/12298!
>>>> >> >>>>> I have addressed most of the comments, but a few questions
>>>> still remain which might merit a bit wider audience:
>>>> >> >>>>>
>>>> >> >>>>> We should decide on the expected filtering behavior when the
>>>> filters are pushed down to the readers. Currently the filters are
>>>> applied on a best-effort basis by the file format readers. Some readers
>>>> (Avro) just skip them altogether. There was a suggestion on the PR that
>>>> we might enforce stricter requirements: the readers either reject part
>>>> of the filters, or they apply them fully.
>>>> >> >>>>> Batch sizes are currently parameters for the reader builders,
>>>> which could be set for non-vectorized readers too, which could be
>>>> confusing.
>>>> >> >>>>> Currently the Spark batch reader uses different configuration
>>>> objects, ParquetBatchReadConf and OrcBatchReadConf, as requested by the
>>>> reviewers of the Comet PR. There was a suggestion on the current PR to
>>>> use a common configuration instead.
>>>> >> >>>>>
>>>> >> >>>>> I would be interested in hearing your thoughts about these
>>>> topics.
>>>> >> >>>>>
>>>> >> >>>>> My current take:
>>>> >> >>>>>
>>>> >> >>>>> File format filters: I am leaning towards keeping the current
>>>> lenient behavior. Especially since Bloom filters are not able to do full
>>>> filtering, and are often used as a way to filter out unwanted records.
>>>> Another option would be to implement a secondary filtering step inside
>>>> the file formats themselves, which I think would cause extra complexity
>>>> and possible code duplication. Whatever the decision here, I would
>>>> suggest moving this out to a follow-up PR as the current changeset is
>>>> big enough as it is.
>>>> >> >>>>> Batch size configuration: Currently this is the only property
>>>> which differs between the batch readers and the non-vectorized readers.
>>>> I see 3 possible solutions:
>>>> >> >>>>>
>>>> >> >>>>> Create different builders for vectorized and non-vectorized
>>>> reads - I don't think the current solution is confusing enough to be
>>>> worth the extra class
>>>> >> >>>>> We could put this into the reader configuration property set -
>>>> This could work, but would "hide" the possible configuration mode which
>>>> is valid for both Parquet and ORC readers
>>>> >> >>>>> We could keep things as they are now - I would choose this
>>>> one, but I don't have a strong opinion here
>>>> >> >>>>>
>>>> >> >>>>> Spark configuration: TBH, I'm open to both solutions and happy
>>>> to move in the direction the community decides on
>>>> >> >>>>>
>>>> >> >>>>> Thanks,
>>>> >> >>>>> Peter
>>>> >> >>>>>
>>>> >> >>>>> Jean-Baptiste Onofré <j...@nanthrax.net> wrote (on Fri, Mar 14,
>>>> 2025, 16:31):
>>>> >> >>>>>>
>>>> >> >>>>>> Hi Peter
>>>> >> >>>>>>
>>>> >> >>>>>> Thanks for the update. I will do a new pass on the PR.
>>>> >> >>>>>>
>>>> >> >>>>>> Regards
>>>> >> >>>>>> JB
>>>> >> >>>>>>
>>>> >> >>>>>> On Thu, Mar 13, 2025 at 1:16 PM Péter Váry <
>>>> peter.vary.apa...@gmail.com> wrote:
>>>> >> >>>>>> >
>>>> >> >>>>>> > Hi Team,
>>>> >> >>>>>> > I have rebased the File Format API proposal (
>>>> https://github.com/apache/iceberg/pull/12298) to include the new
>>>> changes needed for the Variant types. I would love to hear your
>>>> feedback, especially Dan and Ryan, as you were the most active during
>>>> our discussions. If I can help in any way to make the review easier,
>>>> please let me know.
>>>> >> >>>>>> > Thanks,
>>>> >> >>>>>> > Peter
>>>> >> >>>>>> >
>>>> >> >>>>>> > Péter Váry <peter.vary.apa...@gmail.com> wrote (on Fri,
>>>> Feb 28, 2025, 17:50):
>>>> >> >>>>>> >>
>>>> >> >>>>>> >> Hi everyone,
>>>> >> >>>>>> >> Thanks for all of the actionable, relevant feedback on the
>>>> PR (https://github.com/apache/iceberg/pull/12298).
>>>> >> >>>>>> >> I have updated the code to address most of it. Please
>>>> check whether you agree with the general approach.
>>>> >> >>>>>> >> If there is a consensus about the general approach, I
>>>> could separate the PR into smaller pieces so we can have an easier time
>>>> reviewing and merging those step-by-step.
>>>> >> >>>>>> >> Thanks,
>>>> >> >>>>>> >> Peter
>>>> >> >>>>>> >>
>>>> >> >>>>>> >> Jean-Baptiste Onofré <j...@nanthrax.net> wrote (on Thu,
>>>> Feb 20, 2025, 14:14):
>>>> >> >>>>>> >>>
>>>> >> >>>>>> >>> Hi Peter
>>>> >> >>>>>> >>>
>>>> >> >>>>>> >>> Sorry for the late reply on this.
>>>> >> >>>>>> >>>
>>>> >> >>>>>> >>> I did a pass on the proposal; it's very interesting and
>>>> well written.
>>>> >> >>>>>> >>> I like the DataFile API, and it is definitely worth
>>>> discussing all together.
>>>> >> >>>>>> >>>
>>>> >> >>>>>> >>> Maybe we can schedule a specific meeting to discuss the
>>>> DataFile API?
>>>> >> >>>>>> >>>
>>>> >> >>>>>> >>> Thoughts?
>>>> >> >>>>>> >>>
>>>> >> >>>>>> >>> Regards
>>>> >> >>>>>> >>> JB
>>>> >> >>>>>> >>>
>>>> >> >>>>>> >>> On Tue, Feb 11, 2025 at 5:46 PM Péter Váry <
>>>> peter.vary.apa...@gmail.com> wrote:
>>>> >> >>>>>> >>> >
>>>> >> >>>>>> >>> > Hi Team,
>>>> >> >>>>>> >>> >
>>>> >> >>>>>> >>> > As mentioned earlier on our Community Sync, I am
>>>> exploring the possibility of defining a FileFormat API for accessing
>>>> different file formats. I have put together a proposal based on my
>>>> findings.
>>>> >> >>>>>> >>> >
>>>> >> >>>>>> >>> > -------------------
>>>> >> >>>>>> >>> > Iceberg currently supports 3 different file formats:
>>>> Avro, Parquet, ORC. With the introduction of the Iceberg V3
>>>> specification, many new features are added to Iceberg.
Some of these features, like new column
>>>> types and default values, require changes at the file format level. The
>>>> changes are added by individual developers with different focus on the
>>>> different file formats. As a result, not all of the features are
>>>> available for every supported file format.
>>>> >> >>>>>> >>> > Also there are emerging file formats like Vortex [1] or
>>>> Lance [2] which, either by specialization or by applying newer research
>>>> results, could provide better alternatives for certain use cases like
>>>> random access for data, or storing ML models.
>>>> >> >>>>>> >>> > -------------------
>>>> >> >>>>>> >>> >
>>>> >> >>>>>> >>> > Please check the detailed proposal [3] and the Google
>>>> document [4], and comment there or reply on the dev list if you have any
>>>> suggestions.
>>>> >> >>>>>> >>> >
>>>> >> >>>>>> >>> > Thanks,
>>>> >> >>>>>> >>> > Peter
>>>> >> >>>>>> >>> >
>>>> >> >>>>>> >>> > [1] - https://github.com/spiraldb/vortex
>>>> >> >>>>>> >>> > [2] - https://lancedb.github.io/lance/
>>>> >> >>>>>> >>> > [3] - https://github.com/apache/iceberg/issues/12225
>>>> >> >>>>>> >>> > [4] -
>>>> https://docs.google.com/document/d/1sF_d4tFxJsZWsZFCyCL9ZE7YuI7-P3VrzMLIrrTIxds
>>>> >> >>>>>> >>> >
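To make the recap earlier in the thread more concrete, here is a minimal, self-contained sketch of the three roles implemented by the file formats (ReadBuilder, AppenderBuilder, ObjectModel). The shapes below are assumptions for illustration, not the actual signatures from PR #12774, and an in-memory list stands in for a real Parquet/Avro/ORC file.

```java
import java.util.ArrayList;
import java.util.List;

public class FileFormatApiSketch {
  // The three roles named in the recap; shapes here are assumed.
  interface ReadBuilder<D> {
    Iterable<D> build();
  }

  interface AppenderBuilder<D> {
    void add(D datum);
    List<D> complete();
  }

  interface ObjectModel<D> {
    String formatName();
    ReadBuilder<D> readBuilder(List<D> file);
    AppenderBuilder<D> appenderBuilder();
  }

  // An in-memory "format" standing in for Parquet/Avro/ORC.
  static class InMemoryModel implements ObjectModel<String> {
    @Override
    public String formatName() {
      return "in-memory";
    }

    @Override
    public ReadBuilder<String> readBuilder(List<String> file) {
      return () -> file;
    }

    @Override
    public AppenderBuilder<String> appenderBuilder() {
      List<String> rows = new ArrayList<>();
      return new AppenderBuilder<String>() {
        @Override
        public void add(String datum) {
          rows.add(datum);
        }

        @Override
        public List<String> complete() {
          return rows;
        }
      };
    }
  }

  public static void main(String[] args) {
    ObjectModel<String> model = new InMemoryModel();
    AppenderBuilder<String> writer = model.appenderBuilder();
    writer.add("row-1");
    writer.add("row-2");
    List<String> file = writer.complete();
    int count = 0;
    for (String row : model.readBuilder(file).build()) {
      count++;
    }
    System.out.println(count);
  }
}
```

The point of the split is that an engine only talks to these roles; which concrete format sits behind them is resolved at registration time rather than compiled into the engine.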