Hi!

I create a DataFrame as follows:

    JavaRDD<Row> rows = ...
    StructType structType = ...

and then call sqlContext.createDataFrame(rows, structType).
I have a pretty complex schema:

    root
     |-- Id: long (nullable = true)
     |-- attributes: struct (nullable = true)
     |    |-- FirstName: array (nullable = true)
     |    |    |-- element: string (containsNull = true)
     |    |-- Identifiers: array (nullable = true)
     |    |    |-- element: struct (containsNull = true)
     |    |    |    |-- Type: array (nullable = true)
     |    |    |    |    |-- element: string (containsNull = true)

When I explode the attributes.Identifiers column, one more field appears in the schema:

     |-- Identifiers: string (nullable = true)

Why is the type of Identifiers string? Is it possible to make it something other than string? From the schema above it is clear that the exploded column should be a struct<array<string>>. Unfortunately I cannot simply cast the column, because casting string to struct is not allowed.

Are there any workarounds to get the correct schema?

Thanks in advance.

Eugene Morozov
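
For reference, here is a minimal sketch of how the schema printed above could be built explicitly with the Spark Java API and how the declared element type of attributes.Identifiers can be checked before exploding. The class name IdentifiersSchema and both helper methods are hypothetical names introduced for illustration; they are not from the original code. The sketch assumes spark-sql is on the classpath. It only inspects the declared schema and does not reproduce or explain the string type seen after explode.

```java
import org.apache.spark.sql.types.ArrayType;
import org.apache.spark.sql.types.DataType;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

import java.util.Arrays;

// Hypothetical helper class, not part of the original post.
public class IdentifiersSchema {

    // Builds a StructType matching the schema tree printed in the post.
    public static StructType buildSchema() {
        // array<string>, elements may be null
        ArrayType stringArray = DataTypes.createArrayType(DataTypes.StringType, true);

        // element of Identifiers: struct with a single field Type: array<string>
        StructType identifier = DataTypes.createStructType(Arrays.asList(
                DataTypes.createStructField("Type", stringArray, true)));

        // attributes: struct<FirstName: array<string>, Identifiers: array<struct<...>>>
        StructType attributes = DataTypes.createStructType(Arrays.asList(
                DataTypes.createStructField("FirstName", stringArray, true),
                DataTypes.createStructField("Identifiers",
                        DataTypes.createArrayType(identifier, true), true)));

        return DataTypes.createStructType(Arrays.asList(
                DataTypes.createStructField("Id", DataTypes.LongType, true),
                DataTypes.createStructField("attributes", attributes, true)));
    }

    // Returns the declared element type of attributes.Identifiers.
    public static DataType identifiersElementType(StructType schema) {
        StructType attributes = (StructType) schema.apply("attributes").dataType();
        ArrayType identifiers = (ArrayType) attributes.apply("Identifiers").dataType();
        return identifiers.elementType();
    }

    public static void main(String[] args) {
        StructType schema = buildSchema();
        // Prints the schema as a tree, in the same layout as printSchema().
        System.out.println(schema.treeString());
        // Prints the declared element type that explode would be expected to yield.
        System.out.println(identifiersElementType(schema).simpleString());
    }
}
```

If the element type reported here is a struct but the exploded column still comes out as string, the difference is introduced after schema construction, which narrows down where to look.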
