and124578963 opened a new issue, #12467:
URL: https://github.com/apache/iceberg/issues/12467
### Apache Iceberg version
1.8.1 (latest release)
### Query engine
Spark
### Please describe the bug 🐞
## Description
When a table in Copy-on-Write (COW) mode already carries position deletes and
equality deletes and a final row-level `DELETE` is then executed, the equality
deletes are not applied to the original data files, leaving rows in the table
that should have been removed.
**Observed Context**:
- The issue **does not occur** when combining `UPDATE` and `MERGE`
operations in COW mode – these work as expected.
- The problem is **specific to COW**; Merge-on-Read (MOR) mode handles the
same scenario correctly.
## Steps to Reproduce
### 1) Data Setup:
- **Create two Parquet data files:**
- `data-file-1.parquet`: IDs `[1, 2, 3, 4, 5]`
- `data-file-2.parquet`: IDs `[6, 7, 8, 9, 10]`
- Configure the table with COW semantics (`write.delete.mode = copy-on-write`).
### 2) Apply Initial Deletes:
- Add a **position delete file** to remove:
- Row 0 (ID `1`) from `data-file-1.parquet`
- Row 0 (ID `6`) from `data-file-2.parquet`
- Add an **equality delete file** targeting IDs `[3, 4, 5, 6, 7, 8, 9, 10]`.
### 3) Execute Final Delete Command:
`DELETE FROM table WHERE id = 2; -- Targets remaining ID '2'`
### Expected Result
After all deletions:
- `SELECT * FROM table` should return no rows, as:
- Position deletes remove IDs `1` and `6`.
- Equality deletes remove IDs `3, 4, 5` (from `data-file-1`) and `7, 8, 9, 10` (from `data-file-2`).
- Final `DELETE WHERE id = 2` removes the last remaining ID (`2`).
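
The expected result above can be checked with a small in-memory sketch. This is a toy model in plain Python, not Iceberg code: the dictionaries and the `live_rows` helper are hypothetical names introduced only for illustration, and the file contents are the ones listed in the reproduction steps.

```python
# Toy model of the reproduction steps; none of this is Iceberg API.
data_files = {
    "data-file-1.parquet": [1, 2, 3, 4, 5],
    "data-file-2.parquet": [6, 7, 8, 9, 10],
}

# Position deletes: row positions to drop, per data file (row 0 in each).
position_deletes = {
    "data-file-1.parquet": {0},
    "data-file-2.parquet": {0},
}

# Equality deletes: ID values to drop wherever they appear.
equality_deletes = {3, 4, 5, 6, 7, 8, 9, 10}


def live_rows(files, pos_deletes, eq_deletes):
    """Return the IDs that survive both kinds of delete files."""
    rows = []
    for name, ids in files.items():
        dropped_positions = pos_deletes.get(name, set())
        for pos, row_id in enumerate(ids):
            if pos in dropped_positions or row_id in eq_deletes:
                continue
            rows.append(row_id)
    return rows


# State after step 2 (both delete files added).
before_final = live_rows(data_files, position_deletes, equality_deletes)

# Step 3: DELETE FROM table WHERE id = 2
after_final = [row_id for row_id in before_final if row_id != 2]

print(before_final)  # [2]
print(after_final)   # []
```

Under this model only ID `2` is live before the final `DELETE`, so an empty table is the expected outcome.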
### Actual Result
`SELECT * FROM table` returns IDs `3, 4, 5`.
- **Observed Issues:**
- The equality deletes targeting `3, 4, 5` (in `data-file-1`) are not applied.
- The `DELETE WHERE id = 2` operation only removes ID `2`, leaving `3, 4, 5` intact.
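
This report does not identify a root cause. The sketch below is only one reading that happens to reproduce the observed output, under two loudly stated assumptions: (1) the COW rewrite triggered by `DELETE WHERE id = 2` reads `data-file-1` with position deletes applied but without equality deletes, and (2) the rewritten file receives a newer sequence number, so the older equality delete no longer covers it (per the table spec, an equality delete applies only to data files with a strictly smaller data sequence number). The class, the function names, the `.rewritten` suffix, and the concrete sequence numbers are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical sequence number and contents of the equality delete file.
EQ_DELETE_SEQ = 3
EQ_DELETE_IDS = {3, 4, 5, 6, 7, 8, 9, 10}


@dataclass
class DataFile:
    name: str
    seq: int                      # data sequence number (hypothetical values)
    ids: list
    pos_deleted: set = field(default_factory=set)


def covered_by_equality_delete(f, row_id):
    # Equality deletes apply only to files with a strictly older sequence number.
    return f.seq < EQ_DELETE_SEQ and row_id in EQ_DELETE_IDS


def scan(files):
    """IDs visible to SELECT * under this toy model."""
    return [
        row_id
        for f in files
        for pos, row_id in enumerate(f.ids)
        if pos not in f.pos_deleted and not covered_by_equality_delete(f, row_id)
    ]


def cow_delete(files, target_id, new_seq, honor_equality_deletes):
    """Rewrite every file containing target_id; leave other files untouched."""
    result = []
    for f in files:
        if target_id not in f.ids:
            result.append(f)
            continue
        kept = [
            row_id
            for pos, row_id in enumerate(f.ids)
            if pos not in f.pos_deleted
            and row_id != target_id
            and not (honor_equality_deletes and covered_by_equality_delete(f, row_id))
        ]
        if kept:
            result.append(DataFile(f.name + ".rewritten", new_seq, kept))
    return result


def make_table():
    return [
        DataFile("data-file-1.parquet", 1, [1, 2, 3, 4, 5], {0}),
        DataFile("data-file-2.parquet", 1, [6, 7, 8, 9, 10], {0}),
    ]


observed = scan(cow_delete(make_table(), 2, new_seq=4, honor_equality_deletes=False))
expected = scan(cow_delete(make_table(), 2, new_seq=4, honor_equality_deletes=True))

print(observed)  # [3, 4, 5]
print(expected)  # []
```

In this model `data-file-2` is never rewritten, so the equality delete still covers it, which would match the fact that only `3, 4, 5` from `data-file-1` reappear. Whether Iceberg's COW path actually behaves this way is not confirmed by anything in this report.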
## Environment
Apache Iceberg Versions: `1.6.1`, `1.8.1`
## Tests
Example of tests:
https://github.com/apache/iceberg/compare/1.8.x...and124578963:iceberg:1.8.x
To run:
`./gradlew :iceberg-spark:iceberg-spark-3.5_2.12:test --tests "org.apache.iceberg.spark.TestSparkExecutionWithEqualityAndPositionDeletes"`
### Willingness to contribute
- [ ] I can contribute a fix for this bug independently
- [ ] I would be willing to contribute a fix for this bug with guidance from
the Iceberg community
- [x] I cannot contribute a fix for this bug at this time