For tables where this is a problem, how are you currently managing older
schemas? Older schemas do not need to be kept if there aren't any snapshots
that reference them.
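
To make the pruning idea concrete, here is a rough sketch of dropping schemas that no snapshot references. It assumes metadata.json has already been parsed into a Python dict with the spec's `schemas`, `snapshots`, and `current-schema-id` fields; the helper name `prune_unreferenced_schemas` is hypothetical and not part of any Iceberg library, and a real table would make this change through a catalog commit rather than by editing the file directly.

```python
def prune_unreferenced_schemas(metadata: dict) -> dict:
    """Return a copy of parsed table metadata keeping only schemas still in use.

    Hypothetical helper for illustration only; not an Iceberg API.
    """
    # A schema is in use if it is the current schema or a snapshot points at it.
    referenced = {metadata["current-schema-id"]}
    for snapshot in metadata.get("snapshots", []):
        schema_id = snapshot.get("schema-id")  # optional on older snapshots
        if schema_id is not None:
            referenced.add(schema_id)

    pruned = dict(metadata)  # shallow copy; the input dict is left untouched
    pruned["schemas"] = [
        s for s in metadata["schemas"] if s["schema-id"] in referenced
    ]
    return pruned


# Made-up example: four schemas, two snapshots, schema 3 is current.
example = {
    "current-schema-id": 3,
    "schemas": [{"schema-id": i, "fields": []} for i in range(4)],
    "snapshots": [
        {"snapshot-id": 100, "schema-id": 1},
        {"snapshot-id": 101, "schema-id": 3},
    ],
}
kept_ids = [s["schema-id"] for s in prune_unreferenced_schemas(example)["schemas"]]
print(kept_ids)  # [1, 3]
```

Whether this is enough depends on how many snapshots are retained: a table that keeps a long snapshot history will still keep most of its schemas.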

On Thu, Feb 12, 2026 at 10:24 AM Russell Spitzer <[email protected]>
wrote:

> My gut instinct on this is that it's a great idea. I think we probably
> need to think a bit more about how to decide on "base" schema promotion but
> theoretically this seems like it should be a huge benefit for wide tables.
>
> On Thu, Feb 12, 2026 at 7:55 AM Talat Uyarer via dev <
> [email protected]> wrote:
>
>> Hi All,
>>
>> I am sharing a new proposal for Iceberg Spec v4: *Delta-Encoded Schemas*.
>> We propose moving away from monolithic schema storage to address a
>> growing scalability bottleneck in high-velocity and ultra-wide table
>> environments.
>>
>> The current Iceberg Spec re-serializes and appends the entire schema
>> object to metadata.json for every schema operation, which leads to
>> massive schema data replication. For a large table with 5,000+ columns
>> and frequent schema updates, this can result in metadata files that grow
>> to gigabytes, causing significant query planning latency and driver-side
>> OOM errors.
>>
>> *Proposal Summary:*
>>
>> We propose implementing *Delta-Encoded Schema Evolution for Spec v4* using
>> a *"Merge-on-Read" (MoR) approach for metadata*. This approach involves
>> transitioning the schemas field from "Full Snapshots" to a sequence of *Base
>> Schemas* (type full) and *Schema Deltas* (type delta) that store
>> differential mutations relative to a base ID.
>>
>> *Key Goals:*
>>
>>    - Achieve a *99.4% reduction in the size of schema-related metadata*.
>>    - Drastically lower the storage and IO requirements for metadata.json.
>>    - Accelerate query planning by reducing the JSON payload size.
>>    - Preserve self-containment by keeping the schema in the metadata
>>    file, avoiding external sidecar files.
>>
>> The full proposal, including the flat resolution model (no delta
>> chaining), the defined set of atomic delta operations (add, update,
>> delete), and the lifecycle/compaction mechanics, is available for review:
>>
>> https://s.apache.org/iceberg-delta-schemas
>>
>> I look forward to your feedback and discussion on the dev list.
>>
>> Thanks
>> Talat
>>
>
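
The flat resolution model mentioned in the quoted proposal (each delta refers directly to a full base schema, with no delta chaining) can be pictured with a small sketch. The entry layout used here, including the keys `type`, `base-schema-id`, `operations`, `op`, `field`, `field-id`, and `changes`, is an assumption made for illustration and is not taken from the proposal document; only the full/delta split and the add, update, and delete operations come from the summary above.

```python
import copy


def resolve_schema(entry: dict, schemas_by_id: dict) -> dict:
    """Materialize a full schema from a 'full' or 'delta' entry.

    Illustrative only: the entry layout is assumed, not taken from the proposal.
    """
    if entry["type"] == "full":
        return entry

    base = schemas_by_id[entry["base-schema-id"]]
    if base["type"] != "full":
        # Flat resolution: a delta may only point at a full base schema.
        raise ValueError("delta must reference a full base schema")

    # Index base fields by field ID; dicts preserve insertion order.
    fields = {f["id"]: copy.deepcopy(f) for f in base["fields"]}
    for op in entry["operations"]:
        if op["op"] == "add":
            fields[op["field"]["id"]] = copy.deepcopy(op["field"])
        elif op["op"] == "update":
            fields[op["field-id"]].update(op["changes"])
        elif op["op"] == "delete":
            del fields[op["field-id"]]
        else:
            raise ValueError(f"unknown operation: {op['op']}")

    return {
        "schema-id": entry["schema-id"],
        "type": "full",
        "fields": list(fields.values()),
    }


base_schema = {
    "schema-id": 0,
    "type": "full",
    "fields": [
        {"id": 1, "name": "a", "type": "int"},
        {"id": 2, "name": "b", "type": "string"},
    ],
}
delta_schema = {
    "schema-id": 1,
    "type": "delta",
    "base-schema-id": 0,
    "operations": [
        {"op": "add", "field": {"id": 3, "name": "c", "type": "long"}},
        {"op": "update", "field-id": 1, "changes": {"type": "long"}},
        {"op": "delete", "field-id": 2},
    ],
}
resolved = resolve_schema(delta_schema, {0: base_schema, 1: delta_schema})
print([f["name"] for f in resolved["fields"]])  # ['a', 'c']
```

Because every delta resolves against a single full schema, reading any schema costs one lookup plus one pass over its operations, which is what makes the choice of when to promote a new base the interesting open question.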
