Sönke Liebau created NIFI-16069:
-----------------------------------
Summary: PutIcebergRecord fails with ClassCastException when
writing complex types (arrays, maps, nested records)
Key: NIFI-16069
URL: https://issues.apache.org/jira/browse/NIFI-16069
Project: Apache NiFi
Issue Type: Bug
Components: Core Framework
Affects Versions: 2.10.0
Reporter: Sönke Liebau
PutIcebergRecord fails to write FlowFiles whose schema contains complex
types, i.e. fields that map to Iceberg list, map, or struct columns.
RecordConverter only translates top-level scalar values
(java.sql.Timestamp/Date/Time -> their java.time equivalents) and passes
complex values through unchanged.
As a result, values reach Iceberg's Parquet writer in NiFi's native
representation, which is incompatible with what Iceberg expects:
* Nested records arrive as org.apache.nifi.serialization.record.MapRecord but
Iceberg requires org.apache.iceberg.StructLike.
* Array fields arrive as Object[] but Iceberg's writer requires a
java.util.Collection.
* Maps and elements/values nested inside these types are likewise not
converted (e.g. a date inside an array or map value).
Because conversion is gated on scalar field types only, records consisting
solely of complex fields skip conversion entirely.
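The array case can be illustrated without NiFi or Iceberg on the classpath. The following is a minimal sketch, not the actual Iceberg writer code: ArrayCastDemo and countElements are hypothetical names, and the cast merely stands in for a writer that, as described above, expects a java.util.Collection.

```java
import java.util.Arrays;
import java.util.Collection;

public class ArrayCastDemo {

    // Hypothetical stand-in for a list writer that assumes its input is a Collection.
    static int countElements(Object value) {
        Collection<?> elements = (Collection<?>) value; // throws ClassCastException for Object[]
        return elements.size();
    }

    // Returns true if handing the value to the stand-in writer fails with a ClassCastException.
    static boolean failsWithClassCast(Object value) {
        try {
            countElements(value);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object arrayValue = new Object[] {"a", "b"};     // array passed through unconverted
        Object listValue = Arrays.asList("a", "b");      // the same data as a Collection
        System.out.println(failsWithClassCast(arrayValue));
        System.out.println(failsWithClassCast(listValue));
    }
}
```

The same pattern applies to nested records: a cast to the expected interface fails when the value is still in its source representation.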
h3. Steps to reproduce
# Create an Iceberg table with a complex column, e.g. a struct (nested
record), a list<...> (array), or a map<...>.
# Configure a PutIcebergRecord processor pointing at that table with a
matching record reader schema.
# Send a FlowFile containing a record with a value for the complex column
(e.g. a nested record for the struct, or an array for the list).
Observed: the FlowFile routes to failure with a ClassCastException
(MapRecord -> StructLike for structs, Object[] -> Collection for arrays).
Expected: the record is written to the Iceberg table successfully.
h3. Root cause
RecordConverter performs only shallow, scalar-only conversion and has no
knowledge of the target Iceberg types, so complex values are passed through in
NiFi's native form.
h3. Proposed Fix
Make RecordConverter recursive and Iceberg-schema-aware. Convert
arrays/collections to List, wrap nested records as Iceberg StructLike
(DelegatedRecord), and convert maps, recursing into element/key/value types so
scalar conversions still apply at any depth.
To make this possible, DelegatedRecord would pass the target StructType into
the converter.
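The recursive, type-driven conversion described above could look roughly like the sketch below. It is self-contained and does not use the real NiFi or Iceberg classes: TargetType and Kind are hypothetical stand-ins for the Iceberg type tree, a java.util.Map stands in for both the NiFi record and the StructLike wrapper, and the class name is made up for illustration. Only the shape of the recursion is meant to carry over.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RecursiveConverterSketch {

    enum Kind { SCALAR, LIST, MAP, STRUCT }

    // Hypothetical stand-in for the target Iceberg type; not org.apache.iceberg.types.Type.
    static final class TargetType {
        final Kind kind;
        final TargetType key;                  // map key type
        final TargetType value;                // list element type or map value type
        final Map<String, TargetType> fields;  // struct field types by name

        private TargetType(Kind kind, TargetType key, TargetType value, Map<String, TargetType> fields) {
            this.kind = kind;
            this.key = key;
            this.value = value;
            this.fields = fields;
        }

        static TargetType scalar() { return new TargetType(Kind.SCALAR, null, null, null); }
        static TargetType list(TargetType element) { return new TargetType(Kind.LIST, null, element, null); }
        static TargetType map(TargetType key, TargetType value) { return new TargetType(Kind.MAP, key, value, null); }
        static TargetType struct(Map<String, TargetType> fields) { return new TargetType(Kind.STRUCT, null, null, fields); }
    }

    // Converts a value according to the target type, recursing into elements, keys, values and fields.
    static Object convert(Object value, TargetType type) {
        if (value == null) {
            return null;
        }
        switch (type.kind) {
            case LIST: {
                // Accept both Object[] and Collection; primitive arrays are not handled in this sketch.
                Collection<?> source = value instanceof Object[]
                        ? Arrays.asList((Object[]) value)
                        : (Collection<?>) value;
                List<Object> out = new ArrayList<>(source.size());
                for (Object element : source) {
                    out.add(convert(element, type.value));
                }
                return out;
            }
            case MAP: {
                Map<Object, Object> out = new LinkedHashMap<>();
                for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                    out.put(convert(entry.getKey(), type.key), convert(entry.getValue(), type.value));
                }
                return out;
            }
            case STRUCT: {
                // A Map stands in for the nested record; a real fix would wrap it as a StructLike.
                Map<?, ?> source = (Map<?, ?>) value;
                Map<String, Object> out = new LinkedHashMap<>();
                for (Map.Entry<String, TargetType> field : type.fields.entrySet()) {
                    out.put(field.getKey(), convert(source.get(field.getKey()), field.getValue()));
                }
                return out;
            }
            default:
                return convertScalar(value);
        }
    }

    // Scalar conversion from java.sql types to java.time, applied at any depth.
    static Object convertScalar(Object value) {
        if (value instanceof java.sql.Timestamp) {
            return ((java.sql.Timestamp) value).toLocalDateTime();
        }
        if (value instanceof java.sql.Date) {
            return ((java.sql.Date) value).toLocalDate();
        }
        if (value instanceof java.sql.Time) {
            return ((java.sql.Time) value).toLocalTime();
        }
        return value;
    }

    public static void main(String[] args) {
        Object[] dates = { java.sql.Date.valueOf("2024-01-02") };
        // The array becomes a List and the nested java.sql.Date becomes a java.time.LocalDate.
        System.out.println(convert(dates, TargetType.list(TargetType.scalar())));
    }
}
```

Driving the recursion from the target type rather than from the runtime class of the value is what lets a date inside an array or map value receive the same conversion as a top-level date.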
--
This message was sent by Atlassian Jira
(v8.20.10#820010)