longvu-db opened a new pull request, #55480: URL: https://github.com/apache/spark/pull/55480
### What changes were proposed in this pull request?

Adds 14 new tests to improve column ID validation test coverage for the DSv2 `Column.id()` feature introduced in #55376. The new tests are inspired by patterns from Delta's `DeltaColumnMappingSuite` and `DeltaColumnMappingSuiteEdge`.

**New tests in `DataSourceV2DataFrameSuite` (11 tests):**

1. **DataFrame operation types**: Ensures column ID validation fires for filter, aggregate, sort, and select operations (not just plain `collect()`).
2. **Subquery**: Validates that `transformWithSubqueries` in the refresh logic correctly validates column IDs in subquery plans.
3. **Rename column interaction**: Verifies rename triggers `COLUMNS_MISMATCH` (schema validation), not `COLUMN_ID_MISMATCH`, because the old column name is no longer found in the current table.
4. **Sequential schema changes**: Tests double drop+re-add cycles and mixed changes (drop+re-add one column while adding another).
5. **Type widening in standard catalog**: Tests that the standard `InMemoryTableCatalog` assigns new column IDs when types change (unlike `TypeChangePreservesColIdTableCatalog`).
6. **insertInto write path**: Extends write-path coverage beyond `writeTo().append()`.
7. **Column ID assignment verification**: Direct unit test verifying IDs are unique, incrementing, and preserved for unchanged columns across schema changes.

**New tests in `DataSourceV2ExtSessionColumnIdSuite` (3 tests):**

8. **External add column (positive)**: Verifies that an external session adding a column does not break existing DataFrames.
9. **External multi-column mismatch**: Tests that multiple column changes across sessions are all detected.
10. **External type widening**: Tests cross-session type change detection.

### Why are the changes needed?

The original PR (#55376) focused on the core detection scenarios.
These additional tests verify correctness across a broader set of DataFrame operations, error classifications, sequential schema changes, and cross-session patterns, all inspired by Delta's comprehensive column mapping test suite.

### Does this PR introduce _any_ user-facing change?

No. Test-only changes.

### How was this patch tested?

New test cases added to existing test suites.

### Was this patch authored or co-authored using generative AI tooling?

Yes.

This PR depends on #55376.
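For readers unfamiliar with column ID validation, the scenarios listed above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyTable` and `validate` are hypothetical names invented for illustration and are not part of Spark, `InMemoryTableCatalog`, or the tests in this PR. The model assumes IDs come from an incrementing counter, that a re-added or re-typed column receives a fresh ID, and that a missing column name is reported before any ID comparison.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyTable:
    """Hypothetical stand-in for a catalog table; not a Spark class."""
    columns: dict = field(default_factory=dict)  # name -> (column id, type)
    _next_id: count = field(default_factory=lambda: count(1))

    def add_column(self, name, dtype):
        # Assumption: every added column gets the next unused ID.
        self.columns[name] = (next(self._next_id), dtype)

    def drop_column(self, name):
        del self.columns[name]

    def rename_column(self, old, new):
        # The ID moves with the column; only the name changes.
        self.columns[new] = self.columns.pop(old)

    def change_type(self, name, dtype):
        # Assumption: a type change assigns a fresh ID.
        self.columns[name] = (next(self._next_id), dtype)

    def snapshot(self):
        # What a DataFrame would have captured at analysis time.
        return dict(self.columns)


def validate(captured, table):
    """Compare a captured schema against the table's current schema."""
    current = table.columns
    missing = [n for n in captured if n not in current]
    if missing:
        return ("COLUMNS_MISMATCH", missing)
    changed = [n for n, (cid, _) in captured.items() if current[n][0] != cid]
    if changed:
        return ("COLUMN_ID_MISMATCH", changed)
    return ("OK", [])


def new_table():
    t = ToyTable()
    t.add_column("id", "int")
    t.add_column("value", "int")
    return t


# Adding a column leaves the captured columns valid.
t1 = new_table()
captured1 = t1.snapshot()
t1.add_column("extra", "string")
after_add = validate(captured1, t1)

# Drop + re-add keeps the name but changes the ID.
t1.drop_column("value")
t1.add_column("value", "int")
after_readd = validate(captured1, t1)

# Rename: the old name is gone, so the name check fails first.
t2 = new_table()
captured2 = t2.snapshot()
t2.rename_column("id", "row_id")
after_rename = validate(captured2, t2)

# Type change under the fresh-ID assumption.
t3 = new_table()
captured3 = t3.snapshot()
t3.change_type("value", "long")
after_widen = validate(captured3, t3)
```

In this toy model the added column is accepted, the drop+re-add and type change surface as `COLUMN_ID_MISMATCH`, and the rename surfaces as `COLUMNS_MISMATCH`, mirroring the classification described in item 3. The real error conditions and their ordering are defined by #55376, not by this sketch.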
