Hi All,

I have been using spark-xml <https://github.com/databricks/spark-xml> in
one of my projects to process some XML files. When I don't provide a
custom schema, the library automatically infers the following schema:
root
 |-- UserData: struct (nullable = true)
 |    |-- UserValue: array (nullable = true)
 |    |    |-- element: string (containsNull = true)
 |    |-- type: string (nullable = true)
 |-- id: string (nullable = true)
 |-- instanceRefs: string (nullable = true)
 |-- name: string (nullable = true)


But this is not the schema I need. What I want is something like this:

root
 |-- UserData: struct (nullable = true)
 |    |-- UserValue: array (nullable = true)
 |    |    |-- title: string (nullable = true)
 |    |    |-- value: string (nullable = true)
 |    |-- type: string (nullable = true)
 |-- id: string (nullable = true)
 |-- instanceRefs: string (nullable = true)
 |-- name: string (nullable = true)

I have been trying to provide a custom schema, but it always says that the
field 'title' is not present in the schema.
I tried changing the data types and the structure, but it still does not
work. Am I missing something, or is there a bug in spark-xml's handling of
nested structures?
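
For reference, here is a minimal sketch of the schema I am aiming for, written
in the JSON form that Spark schemas serialize to. It reads the desired layout
as "UserValue is an array of structs with 'title' and 'value' fields". The
helper functions (field, struct, array) are hypothetical names used only for
this illustration, and the row tag and input path are left elided because they
are not shown above. One assumption worth checking: if 'title' is an XML
attribute rather than a child element, spark-xml by default exposes attributes
with a leading underscore ("_title") and the element text as "_VALUE", so the
field names in a custom schema may need to follow that convention.

```python
import json


def field(name, data_type, nullable=True):
    """One struct field in Spark's JSON schema representation."""
    return {"name": name, "type": data_type, "nullable": nullable, "metadata": {}}


def struct(*fields):
    """A struct type holding the given fields."""
    return {"type": "struct", "fields": list(fields)}


def array(element_type, contains_null=True):
    """An array type whose elements have the given type."""
    return {"type": "array", "elementType": element_type, "containsNull": contains_null}


# Each UserValue entry is a struct, not a plain string.
# Assumption: 'title' and 'value' are the names spark-xml actually produces;
# for attributes or element text these may be '_title' / '_VALUE' instead.
user_value = struct(field("title", "string"), field("value", "string"))

schema = struct(
    field("UserData", struct(
        field("UserValue", array(user_value)),
        field("type", "string"),
    )),
    field("id", "string"),
    field("instanceRefs", "string"),
    field("name", "string"),
)

schema_json = json.dumps(schema)

# With PySpark and spark-xml on the classpath, the dict can be turned into a
# real schema and passed to the reader (row tag and path are elided here):
#   from pyspark.sql.types import StructType
#   spark_schema = StructType.fromJson(schema)
#   df = (spark.read.format("com.databricks.spark.xml")
#         .option("rowTag", "...")
#         .schema(spark_schema)
#         .load("..."))
```

Calling printSchema() on a DataFrame read with such a schema should show an
"element: struct" level under UserValue, with 'title' and 'value' nested
beneath it.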

Thanks and Regards,
Nandan
