santoshsb commented on issue #5452: URL: https://github.com/apache/hudi/issues/5452#issuecomment-1117269863
@xiarixiaoyao FYI, the createNewDF code throws the following error:

`Caused by: java.lang.RuntimeException: java.lang.String is not a valid external type for schema of array<string> at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.ValidateExternalType_2$(Unknown Source) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.MapObjects_1$(Unknown Source) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.createNamedStruct_0_1$(Unknown Source) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.MapObjects_2$(Unknown Source) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.writeFields_1_3$(Unknown Source) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source) at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$Serializer.apply(ExpressionEncoder.scala:207)`

It occurs with the following data (schema simplified to highlight the issue).

Insert:

`{ "resourceType": "Patient", "id": "beca9a29-49bb-40e4-adff-4dbb4d664972", "lastUpdated": "2022-02-14T15:18:18.90836+05:30", "source": "4a0701fe-5c3b-482b-895d-875fcbd2148a", "name": [ { "use": "official", "family": "Keeling57", "given": [ "Serina556" ], "prefix": [ "Ms." ] } ] }`

Update:

`{ "resourceType": "Patient", "id": "beca9a29-49bb-40e4-adff-4dbb4d664972", "lastUpdated": "2022-02-14T15:18:18.90836+05:30", "source": "4a0701fe-5c3b-482b-895d-875fcbd2148a", "name": [ { "use": "official", "family": "Keeling57", "given": [ "Serina556" ] } ] }`

The second JSON (the update) has no `prefix` field inside `name`. Based on createNewDF, that column should be added with null (verified).

Thanks,
Santosh
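For readers following the thread, here is a minimal sketch of what "add the missing column with null" means for a nested record like the one above. It is plain Python, not Spark or Hudi code, and it is not the createNewDF implementation. The schema notation (a dict for a struct, a one-element list for an array, a string for a leaf type) and the names `PATIENT_SCHEMA` and `fill_missing` are assumptions made up for this illustration. Fields that are absent from the schema are dropped by this sketch.

```python
from typing import Any

# Hypothetical schema notation: dict = struct, one-element list = array, str = leaf type.
PATIENT_SCHEMA = {
    "resourceType": "string",
    "id": "string",
    "lastUpdated": "string",
    "source": "string",
    "name": [
        {
            "use": "string",
            "family": "string",
            "given": ["string"],
            "prefix": ["string"],
        }
    ],
}


def fill_missing(value: Any, schema: Any) -> Any:
    """Return a copy of `value` shaped like `schema`; absent fields become None."""
    if value is None:
        return None
    if isinstance(schema, dict):
        if not isinstance(value, dict):
            raise TypeError(f"expected struct, got {type(value).__name__}")
        # Walk the schema's fields so that anything missing in the record is set to None.
        return {key: fill_missing(value.get(key), sub) for key, sub in schema.items()}
    if isinstance(schema, list):
        if not isinstance(value, list):
            # A scalar where an array is expected is rejected instead of passed through.
            raise TypeError(f"expected array, got {type(value).__name__}")
        return [fill_missing(item, schema[0]) for item in value]
    return value  # leaf value, kept as-is


# The update record from the comment: `prefix` is absent inside `name`.
update = {
    "resourceType": "Patient",
    "id": "beca9a29-49bb-40e4-adff-4dbb4d664972",
    "lastUpdated": "2022-02-14T15:18:18.90836+05:30",
    "source": "4a0701fe-5c3b-482b-895d-875fcbd2148a",
    "name": [{"use": "official", "family": "Keeling57", "given": ["Serina556"]}],
}

aligned = fill_missing(update, PATIENT_SCHEMA)
print(aligned["name"][0])
```

After alignment, `aligned["name"][0]["prefix"]` is `None` while `given` stays a list. The array check only illustrates the kind of mismatch the exception message describes (a string value where `array<string>` is expected); it makes no claim about where that mismatch arises in createNewDF.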