The GitHub Actions job "CI" on iceberg-rust.git/main has failed.
Run started by GitHub user liurenjie1024 (triggered by liurenjie1024).
Head commit for run:
1384a4f2d71ed16b73f3b1f139d5dbd4e5035428 / Gerald Berger <[email protected]>
feat(core): Add support for `_file` column (#1824)

## Which issue does this PR close?

- Closes #1766.

## What changes are included in this PR?

Integrates virtual field handling for the `_file` metadata column into `RecordBatchTransformer` using a pre-computed constants map, eliminating post-processing and duplicate lookups.

## Key Changes

**New `metadata_columns.rs` module**: Centralized utilities for metadata columns
- Constants: `RESERVED_FIELD_ID_FILE`, `RESERVED_COL_NAME_FILE`
- Helper functions: `get_metadata_column_name()`, `get_metadata_field_id()`, `is_metadata_field()`, `is_metadata_column_name()`

**Enhanced `RecordBatchTransformer`**:
- Added `constant_fields: HashMap<i32, (DataType, PrimitiveLiteral)>` - pre-computed during initialization
- New `with_constant()` method - computes Arrow type once during setup
- Updated to use pre-computed types and values (avoids duplicate lookups)
- Handles `DataType::RunEndEncoded` for constant strings (memory efficient)

**Simplified `reader.rs`**:
- Pass full `project_field_ids` (including virtual) to RecordBatchTransformer
- Single `with_constant()` call to register `_file` column
- Removed post-processing loop

**Updated `scan/mod.rs`**:
- Use `is_metadata_column_name()` and `get_metadata_field_id()` instead of hardcoded checks

## Are these changes tested?

Yes, comprehensive tests have been added to verify the functionality:

### New Tests (7 tests added)

#### Table Scan API Tests (7 tests)

1. **`test_select_with_file_column`** - Verifies basic functionality of selecting `_file` with regular columns
2. **`test_select_file_column_position`** - Verifies column ordering is preserved
3. **`test_select_file_column_only`** - Tests selecting only the `_file` column
4. **`test_file_column_with_multiple_files`** - Tests multiple data files scenario
5. **`test_file_column_at_start`** - Tests `_file` at position 0
6. **`test_file_column_at_end`** - Tests `_file` at the last position
7. **`test_select_with_repeated_column_names`** - Tests repeated column selection

Report URL: https://github.com/apache/iceberg-rust/actions/runs/20060746531

With regards,
GitHub Actions via GitBox
