Hi All,

I have been using spark-xml <https://github.com/databricks/spark-xml> in one of my projects to process some XML files. When I don't provide a custom schema, the library automatically infers the following schema:

root
 |-- UserData: struct (nullable = true)
 |    |-- UserValue: array (nullable = true)
 |    |    |-- element: string (containsNull = true)
 |    |-- type: string (nullable = true)
 |-- id: string (nullable = true)
 |-- instanceRefs: string (nullable = true)
 |-- name: string (nullable = true)

But this is not the schema I want. What I want is something like this:

root
 |-- UserData: struct (nullable = true)
 |    |-- UserValue: array (nullable = true)
 |    |    |-- title: string (nullable = true)
 |    |    |-- value: string (nullable = true)
 |    |-- type: string (nullable = true)
 |-- id: string (nullable = true)
 |-- instanceRefs: string (nullable = true)
 |-- name: string (nullable = true)

I have been trying to provide a custom schema, but it always reports that the field 'title' is not present in the schema. I tried changing the data types and the structure, but it still does not work. Am I missing something, or is there a bug in spark-xml's handling of nested structures?

Thanks and Regards,
Nandan