This is an automated email from the ASF dual-hosted git repository.
ashvin pushed a commit to branch 631-rfc-for-deletion-vector
in repository https://gitbox.apache.org/repos/asf/incubator-xtable.git
The following commit(s) were added to refs/heads/631-rfc-for-deletion-vector by
this push:
new 421876f7 Add change proposal for conversion of deletion vectors
421876f7 is described below
commit 421876f75dec06c3a98f49136c618ed5f78a58e7
Author: Ashvin Agrawal <[email protected]>
AuthorDate: Tue Jan 28 09:15:12 2025 -0800
Add change proposal for conversion of deletion vectors
---
rfc/rfc-2/2 - Deletion Info Conversion.md | 159 ++++++++++++++++++++++++++++++
1 file changed, 159 insertions(+)
diff --git a/rfc/rfc-2/2 - Deletion Info Conversion.md b/rfc/rfc-2/2 - Deletion
Info Conversion.md
new file mode 100644
index 00000000..53d9c7f5
--- /dev/null
+++ b/rfc/rfc-2/2 - Deletion Info Conversion.md
@@ -0,0 +1,159 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+# RFC-2: Support for conversion of Deletion Vectors from Delta Lake to Iceberg
format
+
+## Proposers**
+
+- @ashvin
+
+## Status
+
+GH Feature Request: https://github.com/apache/incubator-xtable/issues/339
+
+## Abstract
+Deletion vectors are one of the most popular features in table formats,
allowing logical deletion of records, by marking
+them as deleted, without requiring rewriting of the data files to remove them
physically. This document provides a
+high-level proposal of deletion vector conversion in XTable, focusing on the
current implementation of deletion vectors
+in Delta Lake and Iceberg.
+
+## Background
+Deletion vectors feature in table formats like **_Delta Lake_** and **_Apache
Iceberg_** are designed to enhance data
+management efficiency. These vectors allow for the marking of deleted rows
without physically removing them from the
+data files, thereby optimizing operations such as `DELETE`, `UPDATE`, and
`MERGE`. By maintaining a separate files of
+deletions, deletion vectors enable faster query performance and reduce the
need for costly data file rewrites. This
+approach aligns with the merge-on-read paradigm, where changes are merged
during read operations rather than write
+operations, ensuring minimal disruption to ongoing data ingestion processes.
+
+The main types of delete vectors are _**equality deletes**_ and _**position
deletes**_.
+
+#### Equality deletes
+Equality deletes allow for row-level deletions based on specific column values
rather than their positions within a
+file. This means that records in data files matching certain criteria are
considered deleted without altering the
+original data files. This deletion vector type is useful when identifying and
storing the exact position of the row(s)
+is expensive, but the content that identifies the row is clear. This feature
is particularly useful in streaming use
+cases where updates are frequent. Currently, Iceberg supports equality
deletes, however, there is a proposal to replace
+it in future versions.
+
+#### Position deletes
+Position deletes, on the other hand, specify the ordinals of records within a
data file that should be considered
+deleted. This method relies on identifying the exact position within the data
file, thereby providing a more precise and
+direct approach to managing row-level deletions. Position deletes can be
further categorized into two representations:
+simple table representations and compressed bitmap representations.
+
+**_Simple table representations_** involve a straightforward structure
typically stored in Parquet format, containing
+columns like `file_path` and `pos`. The `file_path` indicates the path of the
data file, while `pos` specifies the
+ordinal of the record to be deleted. These files can optionally include the
row data for equality deletes, aiding in
+identifying the exact change which is particularly useful in CDC applications.
+
+A **_compressed bitmap representations_**, leverage a compressed bitmap to
represent deletions. This method allows for a
+more compact and optimized storage of delete vectors, which in turn saves
memory and network bandwidth. The
+_RoaringBitmap_ is a popular choice for this representation. While the bitmaps
are typically stored in separate files,
+in cases where the bitmaps are small because the number of deleted records is
low, they can be embedded within the
+commit logs to optimize the number of network round trips required to load all
the files.
+
+#### Delta Lake and Iceberg
+Currently, Delta Lake supports RoaringBitmaps for deletion vectors, while
Iceberg supports simple table representations.
+The java libraries of Delta Lake and Iceberg provide APIs to read and write
between these representations. Delta Lake
+provides support for deletion vectors for `DELETE` operations since versions
2.4, while `UPDATE` and `MERGE` commands
+default to the copy-on-write mode.
+
+In Delta Lake, the implementation of deletion vectors involves `actions` in
the commit logs, namely `AddFile` and
+`RemoveFile`. When a DML command is executed, it identifies the rows to be
modified and generates a deletion vector,
+typically in a compressed bitmap format like RoaringBitmap. The generated
deletion vector is stored directly
+in the commit logs `json` files (when the number of deletions is minimal).
When stored in separate files, the files
+paths are encoded and referenced in the json files. With each DML operation,
the commit log records the removal of the
+original data file and its addition with updated details including the newly
added deletion vector.
+
+Example:
+```
+Type Path DV Path
+RemoveFile file_a.parquet NULL
+AddFile file_a.parquet file_a_dv_1.bin
+```
+
+If a file with an existing deletion vector is updated again, a new deletion
vector is created, merging the previous and
+new deletions. The commit log records the removal of the old deletion vector
information and the addition of the new
+one.
+
+Example:
+```
+Type Path DV Path
+RemoveFile file_a.parquet file_a_dv_1.bin
+AddFile file_a.parquet file_a_dv_2.bin
+```
+
+When a compaction operation rewrites records into new files, and deletion
vectors associated with data files that are
+removed are also marked as deleted. The commit log records the removal of the
old file and deletion vector and
+the addition of the new compacted file.
+
+Example:
+```
+Type Path DV Path
+RemoveFile file_a.parquet file_a_dv_2.bin
+AddFile file_d.parquet NULL
+```
+
+## Implementation of the conversion of Deletion Vectors in XTable
+On a high level, the implementation of deletion vector conversion in XTable
involves the following steps. XTable must
+first locate the deletion vector, which can be stored either inline within the
commit logs or can be stored in a
+separate deletion vector file whose path is recorded in the commit logs. Once
located, XTable can utilize the utility
+methods provided by table format libraries to stream the ordinals of the
records affected by the deletion operations.
+
+Unlike currently supported conversion of metadata in XTable, deletion vectors
require special handling. While all
+table metadata is present in commit logs and manifest files, deletion vectors
can be stored in separate files. The size
+of the deletion vectors is proportional to the number of records deleted,
which can be significant in large tables.
+The proposed implementation aims to handle this efficiently by streaming the
records from the sources to the targets.
+
+**New Class: `InternalDeletionVector`**
+To improve the functionality, a new class to manage the internal in-memory
representation of deletion vectors is added.
+This is similar to creating internal representation for data files, schema,
and partitioning spec.
+
+The following metadata is relevant for deletion vectors:
+- `dataFilePath`: Path of the data file associated with the deletion vector.
+- `size`: Size of the deletion vector.
+- `countRecordsDeleted`: Number of records deleted.
+- `sourceDeletionVectorFilePath`: Path of the deletion vector file (optional
as deletion vectors can be stored inline).
+- `offset`: Offset of the deletion vector within the file (optional as inline
deletion vectors do not have an offset).
+- `binaryRepresentation`: Binary representation of the inline deletion vector
(optional).
+- `ordinalStream`: Stream of ordinals of the records deleted.
+
+**Modifications in `TableChange`**
+Currently XTable only captures the addition and removal of data files. To
support deletion vectors, the `TableChange`
+class is updated to include the following fields a new field
`deletionVectorsAdded` to capture deletion vectors added
+in a commit. The sources will need to parse the commit logs to extract the
deletion vectors and populate this field.
+
+**Updates in `DeltaActionsConverter`**
+The converter class is responsible for converting Delta Lake `actions` into
internal representations, resulting in
+`TableChange` objects. The converter is updated to extract deletion vectors
from the `AddFile` actions. Currently,
+the converters read content from source metadata files and then close the file
handles. However, this will be different
+for deletion vectors. To avoid the potential memory explosion, the converters
will create a provider of deletion vectors
+ordinals, which will be used to stream the ordinals of the records deleted to
the targets.
+
+**Changes in `DeltaConversionSource`**
+As described above, the delta commit logs adds data file entries in commit log
each time delete vectors are updated.
+This needs special handling in the delta source, to avoid duplication of data
file actions.
+
+## Rollout/Adoption Plan
+
+This change does not cause any breaking changes. Before this change, delete
vectors were ignored and would cause
+incorrect results in the target table. With this change, XTable will be able
to convert the deletion vectors from Delta
+Lake to Iceberg format, ensuring that the target table is consistent with the
source table.
+
+## Test Plan
+
+New tests will be added to verify the conversion of deletion vectors from
Delta Lake to Iceberg format. The integration
+tests will cover generation of position deletes using spark sql, and the
conversion of these deletes to Iceberg format.
\ No newline at end of file