https://github.com/apache/iceberg/pull/14002 is up for the revert; I'll post a new PR that tries to accommodate the replies on this thread.
On Fri, Sep 5, 2025 at 8:53 AM Ryan Blue <rdb...@gmail.com> wrote:

Would anyone object to reverting this change? I don't think that it makes accurate statements. For example, I don't think that "not writing out all columns or not using the latest schema can change the semantics of the data written" is correct. It is perfectly fine to append with an older schema, and writers do this today. Saying that it may "change the semantics of the data written" makes this much scarier than it actually is.

I think this confuses cases and errs too strongly on discouraging reasonable things. It is, of course, bad to use an older schema that drops columns from rows that are being updated. But that's a problem with the engine choosing to drop current data columns more than it is a schema selection issue.

I think that the right path here is to state that operations should be bound to the current schema when they are planned. It is okay to continue producing data with an older schema in currently-running jobs. And we should call out the behavior of defaults. But I think that we should use language that clearly states what will happen rather than a generic "may change the semantics of data written".

On Fri, Sep 5, 2025 at 5:05 AM Nicolae Vartolomei <n...@nvartolomei.com.invalid> wrote:

Quoting the newly added text in the spec:

> Writers must write out all fields with the types specified from a schema present in table metadata. Writers should use the latest schema for writing. Not writing out all columns or not using the latest schema can change the semantics of the data written. The following are possible inconsistencies that can be introduced:

Interpreting this for the Iceberg REST Catalog interaction: writers should include the assert-current-schema-id requirement when committing to a table. Otherwise, inconsistencies may be introduced. I might be on a slightly outdated Iceberg build, but from what I can see, Spark with Iceberg doesn't include this requirement during table commits (example: https://gist.github.com/nvartolomei/2fd6e994d1d9b5d61597c16339100518).

Do we consider this an implementation bug, then?

Also, what should a sensible implementation do when the requirement fails? Reload the table schema and check whether the new schema is compatible with the writer schema (i.e., the write wouldn't be subject to the possible inconsistencies described by the spec), and otherwise rewrite the data? For example, rewrite parquet files by adding new all-null fields if needed, etc.
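A rough sketch of the recovery path described above, for illustration only: on a failed assert-current-schema-id requirement, reload the table schema, check whether the files written under the stale schema would still read the same under the current one, and rewrite them otherwise. Every type and method here is a placeholder, not the Iceberg or REST catalog API.

```java
// Hypothetical retry loop for a writer whose schema may be stale. The Catalog,
// Schema, and Append interfaces are stand-ins used only to show the control flow.
final class StaleSchemaRetry {
  interface Schema { int schemaId(); }
  interface Append { Append rewrittenFor(Schema currentSchema); }
  interface Catalog {
    Schema currentSchema();                                    // reload table metadata
    boolean commit(Append append, int assertCurrentSchemaId);  // false = requirement failed
  }

  static void commitWithRetry(Catalog catalog, Schema writerSchema, Append append) {
    while (true) {
      Schema current = catalog.currentSchema();
      if (current.schemaId() != writerSchema.schemaId()
          && !safeToCommitAsIs(writerSchema, current)) {
        // e.g. a new column gained a non-null write-default, or a written
        // column's type changed: rewrite the data files for the current schema
        append = append.rewrittenFor(current);
      }
      if (catalog.commit(append, current.schemaId())) {
        return; // the requirement held and the commit was applied
      }
      // requirement failed: the schema changed between reload and commit; re-check
    }
  }

  static boolean safeToCommitAsIs(Schema writerSchema, Schema currentSchema) {
    // left abstract: "safe" means the stale files read identically under the
    // current schema (no changed defaults, no conflicting type or nullability
    // changes); returning false makes this sketch rewrite conservatively
    return false;
  }
}
```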
On Thu, Sep 4, 2025 at 10:19 PM Micah Kornfield <emkornfi...@gmail.com> wrote:

Just to follow up: the PR ended up getting merged. In theory there perhaps should have been a vote, but unless someone feels strongly or would like to fine-tune the language further, perhaps we can consider the topic resolved?

On Wed, Aug 27, 2025 at 2:17 PM Micah Kornfield <emkornfi...@gmail.com> wrote:

I opened https://github.com/apache/iceberg/pull/13936 as a draft proposal to capture the conversation.

BTW, one area this brings up that I don't think the specification handles is changing between nullable and not-nullable fields. Outdated schemas have some implications in these cases as well.

Cheers,
Micah

On Tue, Aug 26, 2025 at 10:13 AM Micah Kornfield <emkornfi...@gmail.com> wrote:

I think the original question is ambiguous. We should probably split it into two questions:

1. Is it OK to write out an "int" instead of a "long" if the writer's schema says the value is a long?

I think the answer here is that we recommend not doing so, even though it would likely work.

2. Is it OK to use an older schema for writing?

The consensus on the thread seems to be yes. I'll note that this can cause confusing results when the "write-default" [1] value for a column changes. We should probably have an implementation note to clarify:

a. Using a stale schema is allowed.
b. It might cause inconsistent results in the face of multiple writers when default values are used.

Thoughts?

Thanks,
Micah

On Mon, Aug 25, 2025 at 4:59 PM Ryan Blue <rdb...@gmail.com> wrote:

I agree with Dan that type promotion should be well-defined. If it's a grey area then we should clarify it in the spec.

How it works today is that schema evolution always produces a schema that can read files written with any older schema. When a type is promoted, the new schema can read any older data file, but readers may need to promote values like the [int-to-long reader](https://github.com/apache/iceberg/blob/main/parquet/src/main/java/org/apache/iceberg/parquet/ParquetValueReaders.java#L546-L560) does. You aren't guaranteed to be able to read new data using an older schema, so the latest schema should always be used, or you should use the schema attached to a snapshot.

Because files with older schemas can always be read, it is safe to write files with an older schema. This happens fairly regularly, as Steven noted, in cases where a writer has a fixed schema and is long-running.

Ryan

On Thu, Aug 21, 2025 at 5:37 PM Steven Wu <stevenz...@gmail.com> wrote:

> This means that you can have writers using different schema to write (use cases include different partitioning or "out-of-date" writers), but the data is still valid.

+1 on Dan's point. Both batch and streaming writers can have stale schemas. Long-running streaming jobs may stay stale for extended periods before picking up the new schema during a restart.

On Wed, Aug 20, 2025 at 2:50 PM Daniel Weeks <dwe...@apache.org> wrote:

I think I'm going to disagree and argue that it's not really a gray area.

Having strict schema evolution rules, together with the way schemas are tracked, means that there is independence between writer and reader schemas, which remain compatible due to the evolution rules.

This means that you can have writers using different schemas to write (use cases include different partitioning or "out-of-date" writers), but the data is still valid.

How you promote the physical representation during a read/scan operation results in a consistent presentation with the read schema.

All of the representations are technically valid.

-Dan
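A minimal, self-contained sketch of the read-side promotion described above: the data file physically stores ints, the read schema says long, and the reader widens each value as it is read. The Reader interface below is a stand-in for illustration, not the Iceberg ParquetValueReaders API linked in Ryan's message.

```java
import java.util.List;

public class PromoteOnReadDemo {
  // stand-in for a column reader; not an Iceberg interface
  interface Reader<T> { T read(int row); }

  // reader over a column that was written as int under an older schema
  static Reader<Integer> intColumn(List<Integer> values) { return values::get; }

  // wraps the int reader so the column surfaces as long, matching the current schema
  static Reader<Long> promoteToLong(Reader<Integer> ints) {
    return row -> ints.read(row).longValue();
  }

  public static void main(String[] args) {
    Reader<Long> col = promoteToLong(intColumn(List.of(1, 2, 3)));
    System.out.println(col.read(2)); // prints 3, surfaced as a long
  }
}
```

Because the promotion happens entirely on the read path, files written with the narrower type remain readable after the schema is widened, which is why writing with an older schema stays safe.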
On Mon, Aug 18, 2025 at 7:46 AM Russell Spitzer <russell.spit...@gmail.com> wrote:

+1 to what Micah said :) sorry about the typo

On Mon, Aug 18, 2025 at 9:45 AM Russell Spitzer <russell.spit...@gmail.com> wrote:

+1 to what Micaah, We have never really written rules about what is "allowed" in this particular context, but since a reader needs to be able to handle both int/long values for the column, there isn't really any danger in writing new files with the narrower type. If a reader couldn't handle this, then type promotion would be impossible.

I would include all columns in the file; the space requirements for an all-null column (or all-constant column) should be very small. I believe the reason we originally wrote those rules in was to avoid folks doing the Hive-style implicit columns from the partition tuple (although we also have handling for this).

On Sun, Aug 17, 2025 at 11:15 PM Micah Kornfield <emkornfi...@gmail.com> wrote:

Hi Nic,
This is IMO a gray area.

> However, is it allowed to write *new* parquet files with the old types (int) and commit them to the table with a table schema where types are promoted (long)?

IMO I would expect writers to be writing files that are consistent with the current metadata, so ideally they would not be written with int if it is now long. In general, though, in these cases I think most readers are robust to reading type-promoted files. We should probably clarify this in the specification.

> Also, is it allowed to commit parquet files, in general, which contain only a subset of the columns of the table schema? I.e., if I know a column is all NULLs, can we just skip writing it?

As currently worded, the spec on writing data files (https://iceberg.apache.org/spec/#writing-data-files) says to include all columns. Based on column projection rules, however, failing to do so should also not cause problems.

Cheers,
Micah
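To make the column-projection point above concrete, here is a rough, self-contained sketch of matching columns by field id, where a field id absent from the data file simply reads as null; the Map-based "file row" is only an illustration, not how Iceberg or Parquet actually store data.

```java
import java.util.Map;

public class ProjectionDemo {
  // a data-file row written without field id 3 (the all-null column the writer skipped)
  static final Map<Integer, Object> FILE_ROW = Map.of(1, 42L, 2, "a");

  // columns are matched by field id; an id missing from the file projects as null
  static Object project(Map<Integer, Object> fileRow, int fieldId) {
    return fileRow.get(fieldId);
  }

  public static void main(String[] args) {
    System.out.println(project(FILE_ROW, 1)); // 42
    System.out.println(project(FILE_ROW, 3)); // null: the column was never written
  }
}
```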
On Fri, Aug 15, 2025 at 8:45 AM Nicolae Vartolomei <n...@nvartolomei.com.invalid> wrote:

Hi,

I'm implementing an Iceberg writer[^1] and have a question about what type promotion actually means as part of the schema evolution rules.

The Iceberg spec [specifies][spec-evo] which type promotions are allowed. No confusion there.

The confusion on my end arises when it comes to actually writing, i.e., parquet data. Let's take for example the int to long promotion. What is actually allowed under this promotion rule? Let me try to show what I mean.

Obviously, if I have a schema-id N with field A of type int and table snapshots with this schema, then it is possible to update the table schema-id to > N, where field A now has type long, and this new schema can read parquet files with the old type.

However, is it allowed to write *new* parquet files with the old types (int) and commit them to the table with a table schema where types are promoted (long)?

Also, is it allowed to commit parquet files, in general, which contain only a subset of the columns of the table schema? I.e., if I know a column is all NULLs, can we just skip writing it?

Appreciate you taking the time to look at this,
Nic

[spec-evo]: https://iceberg.apache.org/spec/#schema-evolution
[^1]: This is for Redpanda's native Iceberg integration (https://github.com/redpanda-data/redpanda).
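To round out the write-default caveat Micah raises earlier in the thread, a small worked illustration (the schema/row model and the default values are hypothetical, not the Iceberg API): two writers hold different schema versions whose write-default for column "c" differs, both append a row that omits "c", and the stored values diverge.

```java
import java.util.HashMap;
import java.util.Map;

public class WriteDefaultDemo {
  // hypothetical schema model carrying only a write-default per column name
  record Schema(int schemaId, Map<String, Object> writeDefaults) {}

  // a writer fills any omitted column with its own schema's write-default
  static Map<String, Object> write(Schema writerSchema, Map<String, Object> row) {
    Map<String, Object> stored = new HashMap<>(row);
    writerSchema.writeDefaults().forEach(stored::putIfAbsent);
    return stored;
  }

  public static void main(String[] args) {
    Schema v1 = new Schema(1, Map.of("c", 0));   // older schema: write-default 0
    Schema v2 = new Schema(2, Map.of("c", -1));  // current schema: write-default -1

    Map<String, Object> row = Map.of("id", 7L);  // neither writer supplies "c"
    System.out.println(write(v1, row));          // stores c = 0
    System.out.println(write(v2, row));          // stores c = -1
  }
}
```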