Matt Zhang created SPARK-56844:
----------------------------------

             Summary: Support ArrayType / MapType / StructType in 
ConstantColumnVector and FileSourceMetadataAttribute
                 Key: SPARK-56844
                 URL: https://issues.apache.org/jira/browse/SPARK-56844
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 4.2.0
            Reporter: Matt Zhang


`FileSourceMetadataAttribute.isSupportedType` currently rejects ArrayType, 
MapType, and StructType, citing `ColumnVectorUtils.populate` as the limiting 
factor. As a result, file source constant metadata columns cannot use complex 
types, even though all the underlying machinery (`RowToColumnConverter`, 
`OffHeapColumnVector` array/map layout) supports them.

This issue tracks broadening that gate, implementing the missing populate and 
scatter paths in `ColumnVectorUtils.populate` and 
`ConstantColumnVector.writeToOffHeapColumnVector`, and enabling complex 
constants end to end.

Specifically:
- `FileSourceMetadataAttribute.isSupportedType` accepts ArrayType, MapType, and 
StructType recursively, provided every nested type is also supported.
- `ColumnVectorUtils.populate` gains struct, array, and map branches. The array 
and map branches allocate a one-row off-heap backing vector and reuse the 
existing `RowToColumnConverter` to write the constant value with full recursive 
type support.
- `ConstantColumnVector` gains optional ownership of a backing 
`WritableColumnVector`, which `close()` releases. `writeToOffHeapColumnVector` 
gains array/map branches that copy the constant's M elements into the target 
child vector once and then write `(offset=0, length=M)` for every row.
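
As a rough illustration of the first and last points, the sketch below models a recursive type gate and the "copy once, point every row at it" array layout in plain Java. It does not use Spark's classes: `ComplexConstantSketch`, `Type`, `ArrayColumn`, `isSupported`, and `scatterConstant` are hypothetical names, the set of supported leaf types is an arbitrary placeholder, and the element type is fixed to `int` for brevity.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration only; none of these are Spark APIs.
final class ComplexConstantSketch {

    // Toy type model: a kind plus child types
    // (array: element; map: key and value; struct: fields).
    static final class Type {
        final String kind;
        final List<Type> children;

        Type(String kind, Type... children) {
            this.kind = kind;
            this.children = Arrays.asList(children);
        }
    }

    // Recursive gate: a complex type is supported only if every child type is.
    static boolean isSupported(Type t) {
        switch (t.kind) {
            case "array":
            case "map":
            case "struct":
                for (Type child : t.children) {
                    if (!isSupported(child)) {
                        return false;
                    }
                }
                return true;
            case "int":
            case "long":
            case "string":
                return true;   // placeholder leaf set, assumed for this sketch
            default:
                return false;  // any other leaf is rejected
        }
    }

    // Toy array column: one shared child buffer plus per-row (offset, length).
    static final class ArrayColumn {
        final int[] child;
        final int[] offsets;
        final int[] lengths;

        ArrayColumn(int numRows, int childCapacity) {
            child = new int[childCapacity];
            offsets = new int[numRows];
            lengths = new int[numRows];
        }

        // Materialize the array value stored at the given row.
        int[] getArray(int row) {
            int start = offsets[row];
            return Arrays.copyOfRange(child, start, start + lengths[row]);
        }
    }

    // Copy the M constant elements once, then write (offset=0, length=M) per row.
    static ArrayColumn scatterConstant(int[] constant, int numRows) {
        int m = constant.length;
        ArrayColumn col = new ArrayColumn(numRows, m);
        System.arraycopy(constant, 0, col.child, 0, m);
        for (int row = 0; row < numRows; row++) {
            col.offsets[row] = 0;
            col.lengths[row] = m;
        }
        return col;
    }
}
```

In this model the child buffer holds M elements regardless of the number of rows; only the offset and length arrays grow with the batch size.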

No user-facing API changes.


