adriangb commented on code in PR #19955:
URL: https://github.com/apache/datafusion/pull/19955#discussion_r2729323992


##########
docs/source/library-user-guide/upgrading.md:
##########
@@ -118,6 +118,78 @@ let context = SimplifyContext::default()
 
 See [`SimplifyContext` 
documentation](https://docs.rs/datafusion-expr/latest/datafusion_expr/simplify/struct.SimplifyContext.html)
 for more details.
 
+### Struct Casting Now Requires Field Name Overlap
+
+DataFusion's struct casting mechanism previously allowed casting between 
structs with differing field names if the field counts matched. This 
"positional fallback" behavior could silently misalign fields and cause data 
corruption.
+
+**Breaking Change:**
+
+Starting with DataFusion 53.0.0, struct casts now require **at least one 
overlapping field name** between the source and target structs. Casts without 
field name overlap are rejected at plan time with a clear error message.
+
+**Who is affected:**
+
+- Applications that cast between structs with no overlapping field names
+- Queries that rely on positional struct field mapping (e.g., casting 
`struct(x, y)` to `struct(a, b)` based solely on position)
+- Code that constructs or transforms struct columns programmatically
+
+**Migration guide:**
+
+If you encounter an error like:
+
+```text
+Cannot cast struct with 2 fields to 2 fields because there is no field name 
overlap
+```
+
+You must explicitly rename or map fields to ensure at least one field name 
matches. Here are common patterns:
+
+**Example 1: Rename fields in the target schema to match source names**
+
+**Before (would fail now):**
+
+```sql
+-- This would previously succeed by mapping positionally: x→a, y→b
+SELECT CAST(source_col AS STRUCT<a INT, b INT>) FROM table1;
+```
+
+**After (must align names):**
+
+```sql
+-- Explicitly rename to match source field names
+SELECT CAST(source_col AS STRUCT<x INT, y INT>) FROM table1;
+
+-- OR use a struct constructor with explicit field names
+SELECT STRUCT_CONSTRUCT(
+    'x', source_col.x,
+    'y', source_col.y
+) FROM table1;
+```
+
+**Example 2: Using struct constructors to rebind fields**
+
+If you need to map fields by position, use explicit struct construction:
+
+```rust,ignore
+// Rust API: Build the target struct explicitly
+let source_array = /* ... */;
+let target_field = Field::new("target_col",
+    DataType::Struct(vec![
+        FieldRef::new("new_a", DataType::Int32),
+        FieldRef::new("new_b", DataType::Utf8),
+    ]));
+
+// Don't rely on casting—construct directly
+// Use struct builders or row constructors that preserve your mapping logic
+```
+
+**Why this change:**

Review Comment:
   I think it's worth mentioning that this matches DuckDB's behavior.



##########
docs/source/library-user-guide/upgrading.md:
##########
@@ -118,6 +118,78 @@ let context = SimplifyContext::default()
 
 See [`SimplifyContext` 
documentation](https://docs.rs/datafusion-expr/latest/datafusion_expr/simplify/struct.SimplifyContext.html)
 for more details.
 
+### Struct Casting Now Requires Field Name Overlap
+
+DataFusion's struct casting mechanism previously allowed casting between 
structs with differing field names if the field counts matched. This 
"positional fallback" behavior could silently misalign fields and cause data 
corruption.
+
+**Breaking Change:**
+
+Starting with DataFusion 53.0.0, struct casts now require **at least one 
overlapping field name** between the source and target structs. Casts without 
field name overlap are rejected at plan time with a clear error message.
+
+**Who is affected:**
+
+- Applications that cast between structs with no overlapping field names
+- Queries that rely on positional struct field mapping (e.g., casting 
`struct(x, y)` to `struct(a, b)` based solely on position)
+- Code that constructs or transforms struct columns programmatically
+
+**Migration guide:**
+
+If you encounter an error like:
+
+```text
+Cannot cast struct with 2 fields to 2 fields because there is no field name 
overlap
+```
+
+You must explicitly rename or map fields to ensure at least one field name 
matches. Here are common patterns:
+
+**Example 1: Rename fields in the target schema to match source names**
+
+**Before (would fail now):**
+
+```sql
+-- This would previously succeed by mapping positionally: x→a, y→b
+SELECT CAST(source_col AS STRUCT<a INT, b INT>) FROM table1;
+```
+
+**After (must align names):**
+
+```sql
+-- Explicitly rename to match source field names
+SELECT CAST(source_col AS STRUCT<x INT, y INT>) FROM table1;

Review Comment:
   Is this something other engines support? I would have guessed this fails 
under the "no name overlap" rule.



##########
datafusion/common/src/nested_struct.rs:
##########
@@ -323,7 +306,11 @@ fn validate_field_compatibility(
     Ok(())
 }
 
-fn has_one_of_more_common_fields(
+/// Check if two field lists have at least one common field by name.
+///
+/// This is useful for validating struct compatibility when casting between 
structs,
+/// ensuring that source and target fields have overlapping names.
+pub fn has_one_of_more_common_fields(

Review Comment:
   Does this need to be `pub`?
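
   For context on the visibility question, here is a minimal std-only sketch of a name-overlap check kept crate-private. It is not the code from the PR: the name `has_common_field_name` and the use of plain `&str` slices instead of Arrow fields are assumptions made for illustration, and it compares names by exact match only.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the helper under review (not DataFusion's API).
/// Returns true when the source and target field-name lists share at least
/// one name, compared by exact string equality.
/// `pub(crate)` keeps the helper callable across modules of the same crate
/// without adding it to the public API.
pub(crate) fn has_common_field_name(source: &[&str], target: &[&str]) -> bool {
    // Collect source names once so each target lookup is O(1) on average.
    let source_names: HashSet<&str> = source.iter().copied().collect();
    target.iter().any(|name| source_names.contains(name))
}
```

   Under these assumptions, `has_common_field_name(&["x", "y"], &["a", "b"])` is false, which corresponds to the cast the upgrade guide says is now rejected, while any shared name makes it true.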



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
