> A major challenge with UniForm right now is its limitation regarding Deletion Vectors (DVs). Support for this is critical for many users migrating their workloads.
UniForm v1/v2 blocked DVs because Iceberg v1/v2 represented positional deletes differently from Delta Lake. That changed in Iceberg v3, so the upcoming version of UniForm (IcebergCompatV3 <https://github.com/delta-io/delta/blob/master/protocol_rfcs/iceberg-compat-v3.md>) will lift this restriction.

On Mon, Mar 2, 2026 at 10:48 AM Vladislav Sidorovich via dev <[email protected]> wrote:

> Hi Anoop,
>
> Thanks for the feedback and for raising these important points.
>
> Regarding the technical feedback on minimizing the use of internal Delta
> Kernel classes: I completely agree. Relying on internal APIs like AddFile
> introduces an unnecessary maintenance burden. My plan is to refactor the
> code (e.g., transitioning to the Row API) once we have alignment on the
> core features this PR will support. I will also put together a list of the
> gaps I've encountered in the Kernel API (such as change detection) so we
> can file those upstream, as you suggested.
>
> As a quick update on the PR's progress: I’ve recently added support for
> UPDATE and DELETE operations, along with expanded test coverage. At this
> stage, the PR is roughly at feature parity with the existing tool
> (excluding VACUUM) but supports newer Delta versions. As outlined in the
> PR description, the next features on the roadmap are:
>
> 1. VACUUM support
> 2. Deletion Vectors (DVs) support
> 3. Incremental conversion
>
> *Bigger question*. To address your broader question about whether we
> should consider sunsetting the Delta Lake module in favor of Delta UniForm:
> based on my experience and observations, there are still compelling reasons
> to maintain a native Iceberg-driven conversion tool.
>
> - *Feature Limitations:* A major challenge with UniForm right now is its
>   limitation regarding Deletion Vectors (DVs). Support for this is critical
>   for many users migrating their workloads.
> - *User Preference:* I've observed that teams looking to migrate to
>   Iceberg strongly prefer "native" tooling maintained by the technology they
>   are migrating *to*, rather than relying on the ecosystem they are
>   trying to move *from*. Having an in-house Iceberg tool gives the
>   community more control over the migration experience.
>
> Let me know your thoughts on the above, particularly regarding the
> long-term need for a native migration path.
>
> Best, Vladislav
>
> On Thu, Feb 26, 2026 at 8:07 PM Anoop Johnson <[email protected]> wrote:
>
>> Vladislav,
>>
>> We should minimize the usage of internal Delta Kernel classes as much
>> as possible. There are no guarantees about the stability of the internal
>> APIs, and it will be a maintenance burden on the Iceberg project. For
>> instance, instead of using the internal `AddFile` class, use the `Row` API
>> with ordinals defined by the scan file schema. I do recognize that there
>> are some gaps in the Kernel API (you mentioned change detection): do you
>> have a list? It would be worth filing an issue against Delta Kernel; it is
>> possible that some of these, like providing file changes, are on their
>> roadmap.
>>
>> *I have a higher-level question to the community:* should we consider
>> sunsetting the Delta Lake module? Delta Lake's UniForm
>> <https://docs.delta.io/delta-uniform/> can already generate Iceberg
>> metadata: it is incremental, and already handles several features such as
>> column mapping. Do we need to duplicate all of that work? Obviously it is
>> better to have less code and fewer components to maintain.
>>
>> Best,
>> Anoop
>>
>> Disclosure: I work on Delta also as part of my day job.
>>
>>
>> On Wed, Feb 25, 2026 at 1:44 PM Vladislav Sidorovich <[email protected]> wrote:
>>
>>> Hi Anoop,
>>>
>>> Thanks a lot for the initial review.
>>>
>>> Data correctness guards:
>>> 1. I will add support for the Remove action soon; work on the PR is in
>>> progress.
>>> 2. Sure, let's reject the `column mapping` feature for now, for
>>> safety. Later I will try to add support for this feature as well.
>>>
>>>
>>> Yes, the PR depends on the `*internal*` API of delta-kernel. I do not
>>> see a simple way to replace it with the public API. As an option, I can
>>> replace these classes with our own `in-house` classes that rely on the
>>> Delta protocol spec. That would be safe at runtime, but it is
>>> additional code that we would need to maintain.
>>>
>>> What do you think about me continuing to work with the `*internal*` Delta
>>> API for now and refactoring this logic before merging the PR, once we
>>> agree on a solution?
>>>
>>>
>>> On Tue, Feb 24, 2026 at 5:29 AM Anoop Johnson <[email protected]> wrote:
>>>
>>>> Hi, Vladislav -
>>>>
>>>> I've done an initial review of the PR
>>>> <https://github.com/apache/iceberg/pull/15407>. Moving to the Delta
>>>> Kernel is the right direction, so thank you for doing this. Here's a
>>>> summary of my initial feedback (full details are in the PR):
>>>>
>>>> Data correctness guards:
>>>> 1. If we encounter `Remove` actions, the conversion should fail fast
>>>> rather than silently skip them. Otherwise tables with DML will produce
>>>> duplicate rows in the Iceberg table.
>>>> 2. Tables with column mapping enabled will produce silent data
>>>> corruption because the Parquet files will have physical column names that
>>>> don't match the logical schema. We should validate this and reject until
>>>> column mapping support is added (which can be done as a separate PR).
>>>>
>>>> The PR relies heavily on io.delta.kernel.internal.* classes, which can
>>>> be fragile. We should consider replacing them with the public Kernel APIs.
>>>>
>>>> Best,
>>>> Anoop
>>>>
>>>>
>>>> On Mon, Feb 23, 2026 at 12:29 AM Vladislav Sidorovich via dev <[email protected]> wrote:
>>>>
>>>>> Hi Iceberg Community,
>>>>>
>>>>> I recently opened a PR to update the existing Delta Lake to Iceberg
>>>>> migration functionality to support recent Delta Lake table versions (read:
>>>>> 3, write: 7). I would appreciate it if anyone could take a look and share
>>>>> thoughts on the architecture and initial implementation.
>>>>>
>>>>> *PR Link:* https://github.com/apache/iceberg/pull/15407
>>>>>
>>>>> The main motivation for sharing this now is to get some early feedback
>>>>> from the community on the approach and the initial implementation.
>>>>>
>>>>> To make reviewing easier, this PR doesn't remove or overwrite the old
>>>>> logic. Instead, I’ve added a new interface implementation utilizing the
>>>>> *Delta Lake Kernel library* (replacing the deprecated Delta Lake standalone
>>>>> library). This side-by-side approach allows for easier comparison and
>>>>> shouldn't introduce any issues with current usage scenarios.
>>>>>
>>>>>
>>>>> *Current PR Scope:*
>>>>>
>>>>> - Maintains support for the existing migration interface.
>>>>> - Migrates the underlying engine to the Delta Lake Kernel library.
>>>>> - Contains the basic migration flow.
>>>>> - Successfully converts all data types, table schemas, and
>>>>>   partition specs.
>>>>> - Currently supports INSERT operations only (Delta Lake Add
>>>>>   action).
>>>>> - *Testing:* Includes unit tests for all supported data types
>>>>>   (including complex arrays and structures) and integration tests for
>>>>>   insert-only scenarios using Spark 3.5.
>>>>>
>>>>> *Future Steps (Next PRs):*
>>>>>
>>>>> Once we align on this foundation, I plan to follow up with:
>>>>>
>>>>> - Adding support for UPDATE and DELETE (Delta Lake Remove action).
>>>>> - Supporting all remaining Delta Lake actions.
>>>>> - Handling edge cases for partitions and generated columns.
>>>>> - Adding Schema Evolution support.
>>>>> - Adding Deletion Vector (DV) support.
>>>>> - Enabling Incremental Conversion (from/to specific Delta
>>>>>   versions).
>>>>> - Adding all tables from the Delta golden tables for robust
>>>>>   testing. *(Note: The current integration test will be updated for
>>>>>   newer Delta Lake versions once the old standalone solution is fully
>>>>>   deprecated/deleted).*
>>>>>
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Vladislav Sidorovich
>>>>>
>>>>> Feedback: *go/feedback-for-vladislav
>>>>> <https://goto.google.com/feedback-for-vladislav>*
>>>>>
>>>
>>> --
>>> Best regards,
>>> Vladislav Sidorovich
>>>
>>> Feedback: *go/feedback-for-vladislav
>>> <https://goto.google.com/feedback-for-vladislav>*
>>>
>
> --
> Best regards,
> Vladislav Sidorovich
>
> Feedback: *go/feedback-for-vladislav*
>
