The GitHub Actions job "Python CI" on iceberg-python.git/main has succeeded. Run started by GitHub user kevinjqliu (triggered by kevinjqliu).
Head commit for run:
2d6a1b97bd0facc8000377b98373eff6433c47dc / geruh <[email protected]>
feat: Add DeleteFileIndex to improve position delete lookup (#2918)

Related to #2255.

# Rationale for this change

This PR is a piece of the existing DFI PR in #2255. It rips out the existing delete->data matching behavior for deletes and instead indexes the delete files for efficient lookup.

The previous implementation:
1. Scanned all delete files with sequence number >= data file's sequence number
2. Created a new `_InclusiveMetricsEvaluator` instance for each data file
3. Evaluated every candidate delete file against the data file's path

Now we extend this workflow with a `DeleteFileIndex` that:
- Indexes path-specific DVs
- Indexes partition-scoped deletes by (spec_id, partition record)
- Uses `bisect_left` for sequence number filtering

This aligns with the Java implementation of the [DeleteFileIndex](https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/DeleteFileIndex.java), following the Python infra.

## Are these changes tested?

New tests added and existing tests continue to pass.

## Are there any user-facing changes?

No

Report URL: https://github.com/apache/iceberg-python/actions/runs/21267255495

With regards,
GitHub Actions via GitBox
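
For readers unfamiliar with the indexing idea described in the commit message, the following is a minimal sketch of it, not the code from the PR. The names `DeleteFile`, `_SequenceGroup`, and `ToyDeleteFileIndex` are hypothetical, and the matching rule (a delete applies when its sequence number is >= the data file's) is taken from the commit message's description of the earlier behavior.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class DeleteFile:
    # Hypothetical stand-in for a delete-file entry; not the PyIceberg class.
    path: str
    sequence_number: int


class _SequenceGroup:
    """Delete files kept sorted by sequence number (illustrative helper)."""

    def __init__(self):
        self.seqs = []   # sorted sequence numbers
        self.files = []  # delete files, in the same order as self.seqs

    def add(self, delete):
        # Insert at the position that keeps both lists sorted.
        i = bisect_left(self.seqs, delete.sequence_number)
        self.seqs.insert(i, delete.sequence_number)
        self.files.insert(i, delete)

    def at_or_after(self, data_seq):
        # Binary search for the first delete with sequence number >= data_seq.
        return self.files[bisect_left(self.seqs, data_seq):]


class ToyDeleteFileIndex:
    """Assumed layout: one group per data-file path, one per (spec_id, partition)."""

    def __init__(self):
        self._by_path = {}
        self._by_partition = {}

    def add_path_delete(self, data_path, delete):
        self._by_path.setdefault(data_path, _SequenceGroup()).add(delete)

    def add_partition_delete(self, spec_id, partition, delete):
        key = (spec_id, partition)
        self._by_partition.setdefault(key, _SequenceGroup()).add(delete)

    def for_data_file(self, data_path, data_seq, spec_id, partition):
        # Look up only the two relevant groups instead of scanning every delete file.
        groups = (
            self._by_path.get(data_path),
            self._by_partition.get((spec_id, partition)),
        )
        matches = []
        for group in groups:
            if group is not None:
                matches.extend(group.at_or_after(data_seq))
        return matches


index = ToyDeleteFileIndex()
index.add_partition_delete(0, ("2024-01-01",), DeleteFile("d1.parquet", 3))
index.add_partition_delete(0, ("2024-01-01",), DeleteFile("d2.parquet", 5))
index.add_path_delete("data/a.parquet", DeleteFile("dv1.puffin", 4))

result = index.for_data_file("data/a.parquet", 4, 0, ("2024-01-01",))
print([d.path for d in result])  # ['dv1.puffin', 'd2.parquet']
```

In this sketch each lookup touches at most two groups and performs one binary search per group, which is the kind of saving the commit message attributes to indexing over per-data-file evaluation of every candidate delete file.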
