Let me summarize the state a bit:

The FileFormat interface needs to expose two distinct methods:

   - WriteBuilder<InternalRow>
   - WriteBuilder<PositionDelete<InternalRow>>
      - After the PDWR (position deletes with row data) deprecation this will be WriteBuilder<PositionDelete>
      - After V2 deprecation this will no longer be needed

Based on the file format methods, the Registry must support four builder
types:

   - WriteBuilder<InternalRow>
   - DataWriteBuilder<InternalRow>
   - EqualityDeleteWriteBuilder<InternalRow>
   - PositionDeleteWriteBuilder<InternalRow>
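
For reference, a rough sketch of how these four could surface on the
FormatModelRegistry (signatures are illustrative only, mirroring the examples
quoted further down in this thread; D would be InternalRow for Spark):

    <D> WriteBuilder<D> writeBuilder(FileFormat format, Class<D> type, OutputFile outputFile);
    <D> DataWriteBuilder<D> dataWriteBuilder(FileFormat format, Class<D> type, OutputFile outputFile);
    <D> EqualityDeleteWriteBuilder<D> equalityDeleteWriteBuilder(FileFormat format, Class<D> type, OutputFile outputFile);
    <D> PositionDeleteWriteBuilder<D> positionDeleteWriteBuilder(FileFormat format, Class<D> type, OutputFile outputFile);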


*API Design Considerations*
There is an argument that the two WriteBuilder methods provided by
FileFormat are essentially the same, differing only in the writerFunction.
While this is technically correct for current implementations, I believe
the API should clearly distinguish between the two writer types to
highlight the differences.

*Discussed Approaches*

*0. Two Explicit Methods on FormatModel* (removed based on previous
comments, but I personally still prefer this)

WriteBuilder<InternalRow> writeBuilder(OutputFile outputFile);
WriteBuilder<PositionDelete<InternalRow>> positionDeleteWriteBuilder(OutputFile outputFile);


Pros: Clear separation of responsibilities

*1. One Builder + One Converter*

WriteBuilder<InternalRow> writeBuilder(OutputFile outputFile);
Function<PositionDelete<D>, D> positionDeleteConverter(Schema schema);


Pros: Keeps the interface compact
Cons: Requires additional documentation and an understanding of why the
conversion logic is needed
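
For illustration, a minimal sketch of how an engine-side position delete writer
could apply the converter from option 1 (sparkFormatModel, deleteSchema and
positionDelete are assumed names, not code from the PR):

    Function<PositionDelete<InternalRow>, InternalRow> converter =
        sparkFormatModel.positionDeleteConverter(deleteSchema);
    FileAppender<InternalRow> appender = sparkFormatModel.writeBuilder(outputFile)
        .schema(deleteSchema)
        .content(FileContent.POSITION_DELETES)  // content only drives writer configuration
        .build();
    // the engine converts each PositionDelete to its native row type before appending
    appender.add(converter.apply(positionDelete));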

*2. Single Method with Javadoc Clarification* (most similar to the current
approach)

WriteBuilder writeBuilder(OutputFile outputFile);


Pros: Minimalistic
Cons: Least explicit; relies entirely on documentation

*2/b. Single Builder with Type Parameter* (based on Russell's suggestion)

WriteBuilder writeBuilder(OutputFile outputFile);
// Usage: builder.build(Class<D> inputType)


Pros: Flexible
Cons: Relies on documentation to clarify the available input types
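
A quick sketch of what a caller would write under 2/b (purely illustrative):

    WriteBuilder builder = sparkFormatModel.writeBuilder(outputFile).schema(table.schema());
    FileAppender<InternalRow> dataAppender = builder.build(InternalRow.class);
    // a position delete appender would be requested the same way, e.g.
    // builder.build(PositionDelete.class); the element type of the returned
    // appender is documented but not enforced by the compiler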

*Bonus*
Options 0 and 1 make it easier to phase out PositionDelete filtering once
V2 tables are deprecated.

Thanks,
Peter

On Thu, Sep 11, 2025 at 18:36 Péter Váry <[email protected]> wrote:

> > Wouldn't PositionDelete<InternalRow> also be an InternalRow in this
> > example? I think that's what I'm confused about.
>
> With the *second approach*, the WriteBuilder doesn’t need to handle
> PositionDelete objects directly. The conversion layer takes care of that,
> so the WriteBuilder only needs to work with InternalRow.
>
> With the *first approach*, we shift that responsibility to the
> WriteBuilder, which then has to support both InternalRow and
> PositionDelete<InternalRow>.
>
> In both cases, the FormatModelRegistry API will still expose the more
> concrete types (PositionDelete / InternalRow). However, under the *first
> approach*, the lower-level API only needs to handle InternalRow,
> simplifying its interface.
> Thanks,
> Peter
>
> On Thu, Sep 11, 2025 at 17:12 Russell Spitzer <[email protected]> wrote:
>
>> Wouldn't PositionDelete<InternalRow> also be an InternalRow in this
>> example? I think that's what I'm confused about.
>>
>> On Thu, Sep 11, 2025 at 5:35 AM Péter Váry <[email protected]>
>> wrote:
>>
>>> Thanks, Russell, for taking a look at this!
>>>
>>> We need to expose four methods on the user-facing API (
>>> FormatModelRegistry):
>>>
>>>    1. *writeBuilder* – for writing arbitrary files without Iceberg
>>>    metadata. In the Iceberg codebase, this is exposed via 
>>> FlinkAppenderFactory and
>>>    the GenericAppenderFactory for creating FileAppender<RowData> and
>>>    FileAppender<Record> only.
>>>    2. *dataWriteBuilder* – for creating and collecting metadata for
>>>    Iceberg DataFiles.
>>>    3. *equalityDeleteWriteBuilder* – for creating and collecting
>>>    metadata for Iceberg EqualityDeleteFiles.
>>>    4. *positionDeleteWriteBuilder* – for creating and collecting
>>>    metadata for Iceberg PositionDeleteFiles.
>>>
>>> We’d like to implement all four using a single WriteBuilder created by
>>> the FormatModels.
>>>
>>> Your suggestion is a good one—it helps formalize the requirements for
>>> the build method and also surfaces an important design question:
>>>
>>> *Who should be responsible for handling the differences between normal
>>> rows (InternalRow) and position deletes (PositionDelete<InternalRow>)*?
>>>
>>>    - Should we have a more complex WriteBuilder class that can create
>>>    both DataFileAppender and PositionDeleteAppender?
>>>    - Or should we push this responsibility to the engine-specific code,
>>>    where we already have some logic (e.g., pathTransformFunc) needed by
>>>    each engine to create the PositionDeleteAppender?
>>>
>>> Thanks,
>>> Peter
>>>
>>>
>>> On Thu, Sep 11, 2025 at 0:11 Russell Spitzer <[email protected]> wrote:
>>>
>>>> I'm a little confused here; I think Ryan mentioned this in the comment
>>>> here https://github.com/apache/iceberg/pull/12774/files#r2254967177
>>>>
>>>> From my understanding there are two options?
>>>>
>>>> 1) We are either producing FormatModels that take a generic row type D
>>>> and produce writers that all take D and write files.
>>>>
>>>> 2) We are creating IcebergModel-specific writers that take DataFile,
>>>> PositionDeleteFile, EqualityDeleteFile, etc., and write files.
>>>>
>>>> The PositionDelete Converter issue seems to stem from attempting to do
>>>> both model 1 (being very generic) and 2, wanting special code to deal with
>>>> PositionDeleteFile<R> objects.
>>>>
>>>> It looks like the code in #12774 is mostly doing model 1, but we are
>>>> trying to add in a specific converter for 2?
>>>>
>>>> Maybe I'm totally lost here, but I was assuming we would do something a
>>>> little Scala-y like:
>>>>
>>>> public <T> FileAppender<T> build(Class<T> type) {
>>>>   if (type == DataFile.class) return (FileAppender<T>) new DataFileAppender();
>>>>   if (type == DeleteFile.class) return (FileAppender<T>) new DeleteFileAppender();
>>>>   // ...
>>>> }
>>>>
>>>>
>>>> So that we only register a single signature, and if a writer-specific
>>>> implementation needs to do something special, it can? I'm trying to catch
>>>> back up to speed on this PR so it may help to do a quick summary of the
>>>> current state and intent. (At least for me)
>>>>
>>>> On Tue, Sep 9, 2025 at 3:42 AM Péter Váry <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Renjie,
>>>>> Thanks for taking a look!
>>>>>
>>>>> Let me clarify a few points:
>>>>> - The converter API is only required for writing position delete files
>>>>> for V2 tables
>>>>> - Currently, there are no plans to support vectorized writing via the
>>>>> Java API
>>>>> - Even if we decide to support vectorized writes, I don't think we
>>>>> would like to implement it for Positional Deletes, which are deprecated in
>>>>> the new spec.
>>>>> - Also, once the positional deletes - which contain the deleted rows -
>>>>> are deprecated (as planned), the conversion of the Position Deletes with
>>>>> only file name and position would be trivial, even for the vectorized
>>>>> writes.
>>>>>
>>>>> So from my perspective, the converter method exists purely for
>>>>> backward compatibility, and we intend to remove it as soon as possible.
>>>>> Sacrificing good practices for the sake of a deprecated feature doesn’t
>>>>> seem worthwhile to me.
>>>>>
>>>>> Thanks,
>>>>> Peter
>>>>>
>>>>> On Mon, Sep 8, 2025 at 12:34 Renjie Liu <[email protected]> wrote:
>>>>>
>>>>>> Hi, Peter:
>>>>>>
>>>>>> I would vote for the first approach. In spite of the compromises
>>>>>> described, the API is still cleaner. Also, I think there are some problems
>>>>>> with the converter API. For example, for vectorized implementations such as
>>>>>> Comet, which accepts columnar batches rather than rows, the converter method
>>>>>> would make things more complicated.
>>>>>>
>>>>>> On Sat, Aug 30, 2025 at 2:49 PM Péter Váry <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> I’ve initiated a discussion thread regarding the deprecation of
>>>>>>> Position Deletes containing row data. You can follow it here:
>>>>>>> https://lists.apache.org/thread/8jw6pb2vq3ghmdqf1yvy8n5n6gg1fq5s
>>>>>>>
>>>>>>> We can proceed with the discussion about the native reader/writer
>>>>>>> deprecation once we have decided on the final API, as the chosen design may
>>>>>>> influence our approach.
>>>>>>>
>>>>>>> Since then, one more question has come up - hopefully the last:
>>>>>>> *How should we handle Position Delete Writers?*
>>>>>>> The File Format API should return builders for either rows or
>>>>>>> PositionDelete objects. Currently the method
>>>>>>> `WriteBuilder.createWriterFunc(Function<MessageType,
>>>>>>> ParquetValueWriter<?>>)` defines the accepted input parameters for the
>>>>>>> writer. Users are responsible for ensuring that the writer function and 
>>>>>>> the
>>>>>>> return type of the `WriteBuilder.build()` are compatible. In the new 
>>>>>>> API,
>>>>>>> we no longer expose writer functions. We still expose FileContent, since
>>>>>>> writer configurations vary by content type, but we don’t expose the 
>>>>>>> types.
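>>>>>>>
>>>>>>> For context, a minimal sketch of the current pattern, where the caller picks
>>>>>>> the writer function and has to keep it in line with the appender type (the
>>>>>>> helper names here are from memory and may differ between versions):
>>>>>>>
>>>>>>>     FileAppender<Record> appender =
>>>>>>>         Parquet.write(outputFile)
>>>>>>>             .schema(table.schema())
>>>>>>>             .createWriterFunc(GenericParquetWriter::buildWriter)
>>>>>>>             .build();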
>>>>>>>
>>>>>>> There are two proposals for handling types for the WriteBuilders:
>>>>>>>
>>>>>>>    1. *Implicit Type Definition via FileContent* - the builder
>>>>>>>    parameter for FileContent would implicitly define the input type for 
>>>>>>> the
>>>>>>>    writer returned by build(), or
>>>>>>>    2. *Engine level conversion* - Engines would convert
>>>>>>>    PositionDelete objects to their native types.
>>>>>>>
>>>>>>> In code:
>>>>>>>
>>>>>>>    - In the 1st proposal, the FormatModel.writeBuilder(OutputFile
>>>>>>>    outputFile) can return anything:
>>>>>>>
>>>>>>>
>>>>>>>     WriteBuilder builder =
>>>>>>>         FormatModelRegistry.writeBuilder(PARQUET, InternalRow.class, outputFile);
>>>>>>>     FileAppender<InternalRow> appender = builder
>>>>>>>         .schema(table.schema())
>>>>>>>         .content(FileContent.DATA)
>>>>>>>         ....
>>>>>>>         .build();
>>>>>>>
>>>>>>>     // Exposed, but FormatModelRegistry.positionDeleteWriteBuilder should be used instead
>>>>>>>     WriteBuilder builder =
>>>>>>>         FormatModelRegistry.writeBuilder(PARQUET, InternalRow.class, outputFile);
>>>>>>>     FileAppender<PositionDelete<InternalRow>> appender = builder
>>>>>>>         .schema(table.schema())
>>>>>>>         .content(FileContent.POSITION_DELETES)
>>>>>>>         ....
>>>>>>>         .build();
>>>>>>>
>>>>>>>    - In the 2nd proposal, the FormatModel needs another method:
>>>>>>>
>>>>>>>     Function<PositionDelete<D>, D> positionDeleteConverter(Schema schema);
>>>>>>>
>>>>>>>   example implementation:
>>>>>>>
>>>>>>>     return delete -> {
>>>>>>>       deleteRecord.update(0, UTF8String.fromString(delete.path().toString()));
>>>>>>>       deleteRecord.update(1, delete.pos());
>>>>>>>       deleteRecord.update(2, delete.row());
>>>>>>>       return deleteRecord;
>>>>>>>     };
>>>>>>>
>>>>>>>     // Content is only used for writer property configuration
>>>>>>>     WriteBuilder<InternalRow> builder = sparkFormatModel.writeBuilder(outputFile);
>>>>>>>     FileAppender<InternalRow> appender = builder
>>>>>>>         .schema(table.schema())
>>>>>>>         .content(FileContent.DATA)
>>>>>>>         ....
>>>>>>>         .build();
>>>>>>>
>>>>>>> Drawbacks
>>>>>>>
>>>>>>>    - Proposal 1:
>>>>>>>       - Type checking for the FileAppenders occurs only at runtime,
>>>>>>>       so user errors surface late (see the sketch after this list).
>>>>>>>       - File Format specification must clearly specify which
>>>>>>>       builder type corresponds to which file content parameter—generics 
>>>>>>> would
>>>>>>>       offer better clarity.
>>>>>>>       - Inconsistent patterns between WriteBuilder and ReadBuilder,
>>>>>>>       as the latter can define output types via generics.
>>>>>>>    - Proposal 2:
>>>>>>>       - Requires FormatModels to implement a converter method to
>>>>>>>       transform PositionDelete<InternalRow> into InternalRow.
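>>>>>>>
>>>>>>> To illustrate the first drawback of proposal 1, an assumed (not actual)
>>>>>>> usage where the declared generic type and the requested content disagree:
>>>>>>>
>>>>>>>     // compiles, but the appender produced for POSITION_DELETES actually
>>>>>>>     // expects PositionDelete<InternalRow> elements
>>>>>>>     FileAppender<InternalRow> appender = builder
>>>>>>>         .content(FileContent.POSITION_DELETES)
>>>>>>>         .build();
>>>>>>>     appender.add(row);  // the mismatch only surfaces here, at runtime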
>>>>>>>
>>>>>>> Since we deprecated writing position delete files in the V3 spec,
>>>>>>> this extra method in the 2nd proposal will be deprecated too. As a 
>>>>>>> result,
>>>>>>> in the long run, we will have a nice, clean API.
>>>>>>> OTOH, if we accept the compromise described in the 1st proposal, the
>>>>>>> results of our decision will remain, even when the functions are 
>>>>>>> removed.
>>>>>>>
>>>>>>> Looking forward to your thoughts.
>>>>>>> Thanks, Peter
>>>>>>>
>>>>>>> On Thu, Aug 14, 2025, 14:12 Péter Váry <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Team,
>>>>>>>>
>>>>>>>> During yesterday’s community sync, we discussed the current state
>>>>>>>> of the File Format API proposal and identified two key questions that
>>>>>>>> require input from the broader community:
>>>>>>>>
>>>>>>>> *1. Dropping support for Position Delete files with Row Data*
>>>>>>>>
>>>>>>>> The current Iceberg V2 spec [1] defines two types of position
>>>>>>>> delete files:
>>>>>>>>
>>>>>>>>    - Files that store only the file name and row position.
>>>>>>>>    - Files that also store the deleted row data.
>>>>>>>>
>>>>>>>> Although this feature is defined in the spec and some tests exist
>>>>>>>> in the Iceberg codebase, we’re not aware of any actual implementation 
>>>>>>>> using
>>>>>>>> the second type (with row data). Supporting V2 table writing via the 
>>>>>>>> new
>>>>>>>> File Format API would be simpler if we dropped support for this 
>>>>>>>> feature.
>>>>>>>> If you know of any use case or reason to retain support for
>>>>>>>> position deletes with row data, please let us know.
>>>>>>>>
>>>>>>>> *2. Deprecating Native File Format Readers/Writers in the API*
>>>>>>>>
>>>>>>>> The current API contains format-specific readers/writers for
>>>>>>>> Parquet, Avro, and ORC. With the introduction of the InternalData and 
>>>>>>>> File
>>>>>>>> Format APIs, Iceberg users can now write files using:
>>>>>>>>
>>>>>>>>    - InternalData API for metadata files (manifest, manifest list,
>>>>>>>>    partition stats).
>>>>>>>>    - File Format API for data and delete files.
>>>>>>>>
>>>>>>>> I propose we deprecate the original format-specific writers and
>>>>>>>> guide users to use the new APIs based on the target file type. If 
>>>>>>>> you’re
>>>>>>>> aware of any use cases that still require the original format-specific
>>>>>>>> writers, please share them.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Peter
>>>>>>>>
>>>>>>>> [1] - Position Delete File Spec:
>>>>>>>> https://iceberg.apache.org/spec/?h=delete#position-delete-files
>>>>>>>>
>>>>>>>> On Tue, Jul 22, 2025 at 16:09 Péter Váry <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> I also put together a solution where the engine-specific format
>>>>>>>>> transformation is separated from the writer, and the engines need to 
>>>>>>>>> take
>>>>>>>>> care of it separately.
>>>>>>>>> This is somewhat complicated on the implementation side (see:
>>>>>>>>> [RowDataTransformer](
>>>>>>>>> https://github.com/apache/iceberg/pull/12298/files#diff-562fa4cc369c908a157f59a9235fd3f389096451e7901686fba37c87b53dee08),
>>>>>>>>> and [InternalRowTransformer](
>>>>>>>>> https://github.com/apache/iceberg/pull/12298/files#diff-546f9dc30e3207d1d2bc0a2722976b55f5a04dcf85a22855e4f400500c317140)),
>>>>>>>>> but simplifies the API.
>>>>>>>>>
>>>>>>>>> @rdblue: Please check the proposed solution. I think this is what
>>>>>>>>> you have suggested
>>>>>>>>>
>>>>>>>>> On Mon, Jun 30, 2025 at 18:42 Péter Váry <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> During the PR review [1], we began exploring what we could use as
>>>>>>>>>> an intermediate layer to reduce the need for engines and file 
>>>>>>>>>> formats to
>>>>>>>>>> implement the full matrix of file format - object model conversions.
>>>>>>>>>>
>>>>>>>>>> To support this discussion, I’ve created and run a set of
>>>>>>>>>> performance benchmarks and compiled a document outlining the 
>>>>>>>>>> potential
>>>>>>>>>> benefits and trade-offs [2].
>>>>>>>>>>
>>>>>>>>>> Feedback is welcome, feel free to comment on the document, the
>>>>>>>>>> PR, or directly in this thread.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Peter
>>>>>>>>>>
>>>>>>>>>> [1] - PR discussion -
>>>>>>>>>> https://github.com/apache/iceberg/pull/12774#discussion_r2093626096
>>>>>>>>>> [2] - File Format and engine object model transformation
>>>>>>>>>> performance -
>>>>>>>>>> https://docs.google.com/document/d/1GdA8IowKMtS3QVdm8s-0X-ZRYetcHv2bhQ9mrSd3fd4
>>>>>>>>>>
>>>>>>>>>> On Wed, May 7, 2025 at 13:15 Péter Váry <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi everyone,
>>>>>>>>>>> The proposed API part is reviewed and ready to go. See:
>>>>>>>>>>> https://github.com/apache/iceberg/pull/12774
>>>>>>>>>>> Thanks to everyone who reviewed it already!
>>>>>>>>>>>
>>>>>>>>>>> Many of you wanted to review, but I know that the time
>>>>>>>>>>> constraints are there for everyone. I still very much would like to 
>>>>>>>>>>> hear
>>>>>>>>>>> your voices, so I will not merge the PR this week. Please review it if you can.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Peter
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Apr 16, 2025 at 7:02 Péter Váry <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Renjie,
>>>>>>>>>>>> The first one for the proposed new API is here:
>>>>>>>>>>>> https://github.com/apache/iceberg/pull/12774
>>>>>>>>>>>> Thanks, Peter
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Apr 16, 2025, 05:40 Renjie Liu <[email protected]>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi, Peter:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for the effort. I totally agree with splitting them
>>>>>>>>>>>>> into smaller prs to move forward.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm quite interested in this topic, so please ping me on
>>>>>>>>>>>>> those split PRs and I'll help to review.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Apr 14, 2025 at 11:22 PM Jean-Baptiste Onofré <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Peter
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Awesome ! Thank you so much !
>>>>>>>>>>>>>> I will do a new pass.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>> JB
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Apr 11, 2025 at 3:48 PM Péter Váry <
>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Hi JB,
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Separated out the proposed interfaces to a new PR:
>>>>>>>>>>>>>> https://github.com/apache/iceberg/pull/12774.
>>>>>>>>>>>>>> > Reviewers can check that out if they are only interested in
>>>>>>>>>>>>>> how the new API would look.
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Thanks,
>>>>>>>>>>>>>> > Peter
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> On Thu, Apr 10, 2025 at 18:25 Jean-Baptiste Onofré <[email protected]> wrote:
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> Hi Peter
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> Thanks for the ping about the PR.
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> Maybe, to facilitate the review and move forward faster,
>>>>>>>>>>>>>> we should
>>>>>>>>>>>>>> >> split the PR in smaller PRs:
>>>>>>>>>>>>>> >> - one with the interfaces (ReadBuilder, AppenderBuilder,
>>>>>>>>>>>>>> ObjectModel,
>>>>>>>>>>>>>> >> AppenderBuilder, DataWriterBuilder, ...)
>>>>>>>>>>>>>> >> - one for each file providers (Parquet, Avro, ORC)
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> Thoughts ? I can help on the split if needed.
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> Regards
>>>>>>>>>>>>>> >> JB
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> On Thu, Apr 10, 2025 at 5:16 AM Péter Váry <
>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>> >> > Since the 1.9.0 release candidate has been created, I
>>>>>>>>>>>>>> would like to resurrect this PR:
>>>>>>>>>>>>>> https://github.com/apache/iceberg/pull/12298 to ensure that
>>>>>>>>>>>>>> we have as long a testing period as possible for it.
>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>> >> > To recap, here is what the PR does after the review
>>>>>>>>>>>>>> rounds:
>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>> >> > Created 3 interface classes which are implemented by the
>>>>>>>>>>>>>> file formats:
>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>> >> > ReadBuilder - Builder for reading data from data files
>>>>>>>>>>>>>> >> > AppenderBuilder - Builder for writing data to data files
>>>>>>>>>>>>>> >> > ObjectModel - Providing ReadBuilders, and
>>>>>>>>>>>>>> AppenderBuilders for the specific data file format and object 
>>>>>>>>>>>>>> model pair
>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>> >> > Updated the Parquet, Avro, ORC implementation for this
>>>>>>>>>>>>>> interfaces, and deprecated the old reader/writer APIs
>>>>>>>>>>>>>> >> > Created interface classes which will be used by the
>>>>>>>>>>>>>> actual readers/writers of the data files:
>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>> >> > AppenderBuilder - Builder for writing a file
>>>>>>>>>>>>>> >> > DataWriterBuilder - Builder for generating a data file
>>>>>>>>>>>>>> >> > PositionDeleteWriterBuilder - Builder for generating a
>>>>>>>>>>>>>> position delete file
>>>>>>>>>>>>>> >> > EqualityDeleteWriterBuilder - Builder for generating an
>>>>>>>>>>>>>> equality delete file
>>>>>>>>>>>>>> >> > No ReadBuilder here - the file format reader builder is
>>>>>>>>>>>>>> reused
>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>> >> > Created a WriterBuilder class which implements the
>>>>>>>>>>>>>> interfaces above
>>>>>>>>>>>>>> (AppenderBuilder/DataWriterBuilder/PositionDeleteWriterBuilder/EqualityDeleteWriterBuilder)
>>>>>>>>>>>>>> based on a provided file format specific AppenderBuilder
>>>>>>>>>>>>>> >> > Created an ObjectModelRegistry which stores the
>>>>>>>>>>>>>> available ObjectModels, and engines and users could request the 
>>>>>>>>>>>>>> readers
>>>>>>>>>>>>>> (ReadBuilder) and writers
>>>>>>>>>>>>>> (AppenderBuilder/DataWriterBuilder/PositionDeleteWriterBuilder/EqualityDeleteWriterBuilder)
>>>>>>>>>>>>>> from.
>>>>>>>>>>>>>> >> > Created the appropriate ObjectModels:
>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>> >> > GenericObjectModels - for reading and writing Iceberg
>>>>>>>>>>>>>> Records
>>>>>>>>>>>>>> >> > SparkObjectModels - for reading (vectorized and
>>>>>>>>>>>>>> non-vectorized) and writing Spark InternalRow/ColumnarBatch 
>>>>>>>>>>>>>> objects
>>>>>>>>>>>>>> >> > FlinkObjectModels - for reading and writing Flink
>>>>>>>>>>>>>> RowData objects
>>>>>>>>>>>>>> >> > An arrow object model is also registered for vectorized
>>>>>>>>>>>>>> reads of Parquet files into Arrow ColumnarBatch objects
>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>> >> > Updated the production code where the reading and
>>>>>>>>>>>>>> writing happens to use the ObjectModelRegistry and the new 
>>>>>>>>>>>>>> reader/writer
>>>>>>>>>>>>>> interfaces to access data files
>>>>>>>>>>>>>> >> > Kept the testing code intact to ensure that the new
>>>>>>>>>>>>>> API/code is not breaking anything
>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>> >> > The original change was not small, and grew
>>>>>>>>>>>>>> substantially during the review rounds. So if you have 
>>>>>>>>>>>>>> questions, or I can
>>>>>>>>>>>>>> do anything to make the review easier, don't hesitate to ask. I 
>>>>>>>>>>>>>> am happy to
>>>>>>>>>>>>>> do anything to move this forward.
>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>> >> > Thanks,
>>>>>>>>>>>>>> >> > Peter
>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>> >> > On Wed, Mar 26, 2025 at 14:54 Péter Váry <[email protected]> wrote:
>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>> >> >> Hi everyone,
>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>> >> >> I have updated the File Format API PR (
>>>>>>>>>>>>>> https://github.com/apache/iceberg/pull/12298) based on the
>>>>>>>>>>>>>> answers and review comments.
>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>> >> >> I would like to merge this only after the 1.9.0 release
>>>>>>>>>>>>>> so we have more time finding any issues and solving them before 
>>>>>>>>>>>>>> this goes
>>>>>>>>>>>>>> to a release for the users.
>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>> >> >> For this I have updated the deprecation comments
>>>>>>>>>>>>>> accordingly.
>>>>>>>>>>>>>> >> >> I would like to ask you to review the PR, so we iron
>>>>>>>>>>>>>> out any possible requested changes and be ready for the merge as 
>>>>>>>>>>>>>> soon as
>>>>>>>>>>>>>> possible after the 1.9.0 release.
>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>> >> >> Thanks,
>>>>>>>>>>>>>> >> >> Peter
>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>> >> >> On Fri, Mar 21, 2025 at 14:32 Péter Váry <[email protected]> wrote:
>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>> >> >>> Hi Renije,
>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>> >> >>> > 1. File format filters
>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>> >> >>> > Do the filters include both filter expressions from
>>>>>>>>>>>>>> both user query and delete filter?
>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>> >> >>> The current discussion is about the filters from the
>>>>>>>>>>>>>> user query.
>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>> >> >>> About the delete filter:
>>>>>>>>>>>>>> >> >>> Based on the suggestions on the PR, I have moved the
>>>>>>>>>>>>>> delete filter out from the main API. Created a 
>>>>>>>>>>>>>> `SupportsDeleteFilter`
>>>>>>>>>>>>>> interface for it which would allow pushing down to the filter to 
>>>>>>>>>>>>>> Parquet
>>>>>>>>>>>>>> vectorized readers in Spark, as this is the only place where we 
>>>>>>>>>>>>>> currently
>>>>>>>>>>>>>> implemented this feature.
>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>> >> >>> On Fri, Mar 21, 2025 at 14:11 Renjie Liu <[email protected]> wrote:
>>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>>> >> >>>> Hi, Peter:
>>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>>> >> >>>> Thanks for the effort on this.
>>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>>> >> >>>> 1. File format filters
>>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>>> >> >>>> Do the filters include both filter expressions from
>>>>>>>>>>>>>> both user query and delete filter?
>>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>>> >> >>>> For filters from user query, I agree with you that we
>>>>>>>>>>>>>> should keep the current behavior.
>>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>>> >> >>>> For delete filters associated with data files, at
>>>>>>>>>>>>>> first I thought file format readers should not care about this. 
>>>>>>>>>>>>>> But now I
>>>>>>>>>>>>>> realized that maybe we need to also push it to the file reader; this is useful
>>>>>>>>>>>>>> is useful
>>>>>>>>>>>>>> when `IS_DELETED` metadata column is not necessary and we could 
>>>>>>>>>>>>>> use these
>>>>>>>>>>>>>> filters (position deletes, etc) to further prune data.
>>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>>> >> >>>> But anyway, I agree that we could postpone it in
>>>>>>>>>>>>>> follow up pr.
>>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>>> >> >>>> 2. Batch size configuration
>>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>>> >> >>>> I'm leaning toward option 2.
>>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>>> >> >>>> 3. Spark configuration
>>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>>> >> >>>> I'm leaning towards using different configuration
>>>>>>>>>>>>>> objects.
>>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>>> >> >>>> On Thu, Mar 20, 2025 at 10:23 PM Péter Váry <
>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>>> >> >>>>> Hi Team,
>>>>>>>>>>>>>> >> >>>>> Thanks everyone for the reviews on
>>>>>>>>>>>>>> https://github.com/apache/iceberg/pull/12298!
>>>>>>>>>>>>>> >> >>>>> I have addressed most of the comments, but a few
>>>>>>>>>>>>>> questions still remain which might merit a bit wider audience:
>>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>>> >> >>>>> We should decide on the expected filtering behavior
>>>>>>>>>>>>>> when the filters are pushed down to the readers. Currently the 
>>>>>>>>>>>>>> filters are
>>>>>>>>>>>>>> applied as best effort for the file format readers. Some readers 
>>>>>>>>>>>>>> (Avro)
>>>>>>>>>>>>>> just skip them altogether. There was a suggestion on the PR that 
>>>>>>>>>>>>>> we might
>>>>>>>>>>>>>> enforce more strict requirements and the readers either reject 
>>>>>>>>>>>>>> part of the
>>>>>>>>>>>>>> filters, or they could apply them fully.
>>>>>>>>>>>>>> >> >>>>> Batch sizes are currently parameters for the reader
>>>>>>>>>>>>>> builders which could be set for non-vectorized readers too which 
>>>>>>>>>>>>>> could be
>>>>>>>>>>>>>> confusing.
>>>>>>>>>>>>>> >> >>>>> Currently the Spark batch reader uses different
>>>>>>>>>>>>>> configuration objects for ParquetBatchReadConf and 
>>>>>>>>>>>>>> OrcBatchReadConf as
>>>>>>>>>>>>>> requested by the reviewers of the Comet PR. There was a 
>>>>>>>>>>>>>> suggestion on the
>>>>>>>>>>>>>> current PR to use a common configuration instead.
>>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>>> >> >>>>> I would be interested in hearing your thoughts about
>>>>>>>>>>>>>> these topics.
>>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>>> >> >>>>> My current take:
>>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>>> >> >>>>> File format filters: I am leaning towards keeping
>>>>>>>>>>>>>> the current lenient behavior. Especially since Bloom filters are not able
>>>>>>>>>>>>>> not able
>>>>>>>>>>>>>> to do a full filtering, and are often used as a way to filter 
>>>>>>>>>>>>>> out unwanted
>>>>>>>>>>>>>> records. Another option would be to implement a secondary 
>>>>>>>>>>>>>> filtering inside
>>>>>>>>>>>>>> the file formats themselves which I think would cause extra 
>>>>>>>>>>>>>> complexity, and
>>>>>>>>>>>>>> possible code duplication. Whatever the decision here, I would 
>>>>>>>>>>>>>> suggest
>>>>>>>>>>>>>> moving this out to a next PR as the current changeset is big 
>>>>>>>>>>>>>> enough as it
>>>>>>>>>>>>>> is.
>>>>>>>>>>>>>> >> >>>>> Batch size configuration: Currently this is the only
>>>>>>>>>>>>>> property which is different in the batch readers and the 
>>>>>>>>>>>>>> non-vectorized
>>>>>>>>>>>>>> readers. I see 3 possible solutions:
>>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>>> >> >>>>> Create different builders for vectorized and
>>>>>>>>>>>>>> non-vectorized reads - I don't think the current solution is 
>>>>>>>>>>>>>> confusing
>>>>>>>>>>>>>> enough to worth the extra class
>>>>>>>>>>>>>> >> >>>>> We could put this to the reader configuration
>>>>>>>>>>>>>> property set - This could work, but "hide" the possible 
>>>>>>>>>>>>>> configuration mode
>>>>>>>>>>>>>> which is valid for both Parquet and ORC readers
>>>>>>>>>>>>>> >> >>>>> We could keep things as it is now - I would chose
>>>>>>>>>>>>>> this one, but I don't have a strong opinion here
>>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>>> >> >>>>> Spark configuration: TBH, I'm open to both solutions
>>>>>>>>>>>>>> and happy to move in the direction the community decides on
>>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>>> >> >>>>> Thanks,
>>>>>>>>>>>>>> >> >>>>> Peter
>>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>>> >> >>>>> On Fri, Mar 14, 2025 at 16:31 Jean-Baptiste Onofré <[email protected]> wrote:
>>>>>>>>>>>>>> >> >>>>>>
>>>>>>>>>>>>>> >> >>>>>> Hi Peter
>>>>>>>>>>>>>> >> >>>>>>
>>>>>>>>>>>>>> >> >>>>>> Thanks for the update. I will do a new pass on the
>>>>>>>>>>>>>> PR.
>>>>>>>>>>>>>> >> >>>>>>
>>>>>>>>>>>>>> >> >>>>>> Regards
>>>>>>>>>>>>>> >> >>>>>> JB
>>>>>>>>>>>>>> >> >>>>>>
>>>>>>>>>>>>>> >> >>>>>> On Thu, Mar 13, 2025 at 1:16 PM Péter Váry <
>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>> >> >>>>>> >
>>>>>>>>>>>>>> >> >>>>>> > Hi Team,
>>>>>>>>>>>>>> >> >>>>>> > I have rebased the File Format API proposal (
>>>>>>>>>>>>>> https://github.com/apache/iceberg/pull/12298) to include the
>>>>>>>>>>>>>> new changes needed for the Variant types. I would love to hear 
>>>>>>>>>>>>>> your
>>>>>>>>>>>>>> feedback, especially Dan and Ryan, as you were the most active 
>>>>>>>>>>>>>> during our
>>>>>>>>>>>>>> discussions. If I can help in any way to make the review easier, 
>>>>>>>>>>>>>> please let
>>>>>>>>>>>>>> me know.
>>>>>>>>>>>>>> >> >>>>>> > Thanks,
>>>>>>>>>>>>>> >> >>>>>> > Peter
>>>>>>>>>>>>>> >> >>>>>> >
>>>>>>>>>>>>>> >> >>>>>> > On Fri, Feb 28, 2025 at 17:50 Péter Váry <[email protected]> wrote:
>>>>>>>>>>>>>> >> >>>>>> >>
>>>>>>>>>>>>>> >> >>>>>> >> Hi everyone,
>>>>>>>>>>>>>> >> >>>>>> >> Thanks for all of the actionable, relevant
>>>>>>>>>>>>>> feedback on the PR (
>>>>>>>>>>>>>> https://github.com/apache/iceberg/pull/12298).
>>>>>>>>>>>>>> >> >>>>>> >> Updated the code to address most of them. Please
>>>>>>>>>>>>>> check if you agree with the general approach.
>>>>>>>>>>>>>> >> >>>>>> >> If there is a consensus about the general
>>>>>>>>>>>>>> approach, I could separate out the PR into smaller pieces so we can have an
>>>>>>>>>>>>>> can have an
>>>>>>>>>>>>>> easier time to review and merge those step-by-step.
>>>>>>>>>>>>>> >> >>>>>> >> Thanks,
>>>>>>>>>>>>>> >> >>>>>> >> Peter
>>>>>>>>>>>>>> >> >>>>>> >>
>>>>>>>>>>>>>> >> >>>>>> >> On Thu, Feb 20, 2025 at 14:14 Jean-Baptiste Onofré <[email protected]> wrote:
>>>>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>>>>> >> >>>>>> >>> Hi Peter
>>>>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>>>>> >> >>>>>> >>> sorry for the late reply on this.
>>>>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>>>>> >> >>>>>> >>> I did a pass on the proposal, it's very
>>>>>>>>>>>>>> interesting and well written.
>>>>>>>>>>>>>> >> >>>>>> >>> I like the DataFile API, and it is definitely worth
>>>>>>>>>>>>>> discussing all together.
>>>>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>>>>> >> >>>>>> >>> Maybe we can schedule a specific meeting to
>>>>>>>>>>>>>> discuss the DataFile API?
>>>>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>>>>> >> >>>>>> >>> Thoughts ?
>>>>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>>>>> >> >>>>>> >>> Regards
>>>>>>>>>>>>>> >> >>>>>> >>> JB
>>>>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>>>>> >> >>>>>> >>> On Tue, Feb 11, 2025 at 5:46 PM Péter Váry <
>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>> >> >>>>>> >>> >
>>>>>>>>>>>>>> >> >>>>>> >>> > Hi Team,
>>>>>>>>>>>>>> >> >>>>>> >>> >
>>>>>>>>>>>>>> >> >>>>>> >>> > As mentioned earlier on our Community Sync I
>>>>>>>>>>>>>> am exploring the possibility to define a FileFormat API for 
>>>>>>>>>>>>>> accessing
>>>>>>>>>>>>>> different file formats. I have put together a proposal based on 
>>>>>>>>>>>>>> my findings.
>>>>>>>>>>>>>> >> >>>>>> >>> >
>>>>>>>>>>>>>> >> >>>>>> >>> > -------------------
>>>>>>>>>>>>>> >> >>>>>> >>> > Iceberg currently supports 3 different file
>>>>>>>>>>>>>> formats: Avro, Parquet, ORC. With the introduction of Iceberg V3
>>>>>>>>>>>>>> specification many new features are added to Iceberg. Some of 
>>>>>>>>>>>>>> these
>>>>>>>>>>>>>> features like new column types, default values require changes 
>>>>>>>>>>>>>> at the file
>>>>>>>>>>>>>> format level. The changes are added by individual developers 
>>>>>>>>>>>>>> with different
>>>>>>>>>>>>>> focus on the different file formats. As a result not all of the 
>>>>>>>>>>>>>> features
>>>>>>>>>>>>>> are available for every supported file format.
>>>>>>>>>>>>>> >> >>>>>> >>> > Also there are emerging file formats like
>>>>>>>>>>>>>> Vortex [1] or Lance [2] which either by specialization, or by 
>>>>>>>>>>>>>> applying
>>>>>>>>>>>>>> newer research results could provide better alternatives for 
>>>>>>>>>>>>>> certain
>>>>>>>>>>>>>> use-cases like random access for data, or storing ML models.
>>>>>>>>>>>>>> >> >>>>>> >>> > -------------------
>>>>>>>>>>>>>> >> >>>>>> >>> >
>>>>>>>>>>>>>> >> >>>>>> >>> > Please check the detailed proposal [3] and
>>>>>>>>>>>>>> the google document [4], and comment there or reply on the dev 
>>>>>>>>>>>>>> list if you
>>>>>>>>>>>>>> have any suggestions.
>>>>>>>>>>>>>> >> >>>>>> >>> >
>>>>>>>>>>>>>> >> >>>>>> >>> > Thanks,
>>>>>>>>>>>>>> >> >>>>>> >>> > Peter
>>>>>>>>>>>>>> >> >>>>>> >>> >
>>>>>>>>>>>>>> >> >>>>>> >>> > [1] - https://github.com/spiraldb/vortex
>>>>>>>>>>>>>> >> >>>>>> >>> > [2] - https://lancedb.github.io/lance/
>>>>>>>>>>>>>> >> >>>>>> >>> > [3] -
>>>>>>>>>>>>>> https://github.com/apache/iceberg/issues/12225
>>>>>>>>>>>>>> >> >>>>>> >>> > [4] -
>>>>>>>>>>>>>> https://docs.google.com/document/d/1sF_d4tFxJsZWsZFCyCL9ZE7YuI7-P3VrzMLIrrTIxds
>>>>>>>>>>>>>> >> >>>>>> >>> >
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
