Re: [PR] [fix](iceberg) Write binary columns with proper Arrow types [doris]

via GitHub Mon, 29 Jun 2026 02:25:42 -0700


github-actions[bot] commented on code in PR #64949:
URL: https://github.com/apache/doris/pull/64949#discussion_r3490606269



##########
be/src/format/table/iceberg/arrow_schema_util.cpp:
##########
@@ -75,13 +75,24 @@ Status ArrowSchemaUtil::convert_to(const iceberg::NestedField& field,
         break;
     }
 
-    case iceberg::TypeID::BINARY:
     case iceberg::TypeID::STRING:
-    case iceberg::TypeID::UUID:
-    case iceberg::TypeID::FIXED:
         arrow_type = arrow::utf8();
         break;
 
+    case iceberg::TypeID::BINARY:
+        arrow_type = arrow::binary();
+        break;
+
+    case iceberg::TypeID::UUID:
+        arrow_type = arrow::fixed_size_binary(16);

Review Comment:
   This makes UUID writes fail for the default Iceberg catalog mapping. FE 
still exposes Iceberg `uuid` columns as Doris `STRING` unless 
`enable.mapping.varbinary` is set, and `IcebergTableSink` sends the original 
Iceberg schema JSON to BE. With this change BE builds an Arrow 
`fixed_size_binary(16)` field, then `convert_to_arrow_batch` calls the 
`DataTypeStringSerDe` for the Doris string column. A normal UUID value like 
`550e8400-e29b-41d4-a716-446655440000` is 36 bytes, so the new fixed-size 
branch returns
`InvalidArgument("Fixed size binary column expects 16 bytes, got 36")`
instead of writing the row. Please either convert canonical UUID strings
to the 16-byte Iceberg representation before appending, or keep this mapping 
aligned with the Doris column type / varbinary catalog setting. An end-to-end 
BE test that writes a UUID-valued block through the Iceberg schema would catch 
this.
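
   For the first option, a minimal sketch of the string-to-bytes step is shown 
below. `parse_canonical_uuid` and `hex_value` are hypothetical names, not 
existing Doris or Arrow functions, and the sketch assumes the canonical 
8-4-4-4-12 hyphenated form with the bytes emitted in the order they appear in 
the string (big-endian), which is my understanding of the 16-byte Iceberg 
layout. It returns an empty optional for anything else so the caller can 
report a per-row error instead of appending a 36-byte value.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical helper for illustration; not part of Doris or Arrow.
// Returns 0-15 for a hex digit, or -1 for any other character.
inline int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper for illustration; not part of Doris or Arrow.
// Parses "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" (36 chars) into 16 bytes,
// in the same order as the hex digits appear in the string.
// Returns std::nullopt if the length, hyphen positions, or digits are invalid.
inline std::optional<std::array<uint8_t, 16>> parse_canonical_uuid(std::string_view text) {
    if (text.size() != 36) return std::nullopt;

    std::array<uint8_t, 16> out{};
    std::size_t byte_index = 0;
    std::size_t i = 0;
    while (i < text.size()) {
        // Hyphens sit at fixed offsets in the canonical form.
        if (i == 8 || i == 13 || i == 18 || i == 23) {
            if (text[i] != '-') return std::nullopt;
            ++i;
            continue;
        }
        // Every group has an even number of digits, so i + 1 stays in range.
        const int hi = hex_value(text[i]);
        const int lo = hex_value(text[i + 1]);
        if (hi < 0 || lo < 0) return std::nullopt;
        out[byte_index++] = static_cast<uint8_t>((hi << 4) | lo);
        i += 2;
    }
    return out;
}
```

   The 16-byte result could then be appended to the fixed-size builder in 
place of the raw string bytes. Whether that conversion belongs in the SerDe or 
in a UUID-specific branch of the batch conversion is a design choice for the 
PR author.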



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

