Hi,
I have the following schema, and I am trying to put the structure below into
a DataFrame or Dataset such that each field inside the struct becomes a
column of the DataFrame.
I tried to follow this link
<http://stackoverflow.com/questions/38753898/how-to-flatten-a-struct-in-a-spark-dataframe>
and did the following:
Dataset<Row> df = ds.select(
    functions.from_json(new Column("value").cast("string"), getSchema()).as("payload"));
Dataset<Row> df1 = df.select(df.col("payload.info"));
df1.printSchema();
root
 |-- info: struct (nullable = true)
 |    |-- index: string (nullable = true)
 |    |-- type: string (nullable = true)
 |    |-- id: string (nullable = true)
 |    |-- name: string (nullable = true)
 |    |-- number: integer (nullable = true)
However, I get the following output, with the whole struct in a single column:
+--------------------+
| info|
+--------------------+
|[,mango,,fruit...|
|[,apple,,fruit...|
I just want the DataFrame in the format below, with one column per struct field. Any ideas?
index | type | id | name | number
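
For reference, here is a minimal sketch of what I understand the linked answer to suggest: selecting "payload.info.*" instead of "payload.info", so that the struct is expanded into one column per field. I have not confirmed this against my real data. The getSchema() below is only a hypothetical stand-in rebuilt from the printSchema() output above, the class name FlattenInfo is made up, and the two input rows are invented samples modeled on the truncated output.

```java
import java.util.Arrays;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.functions;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class FlattenInfo {

    // Hypothetical stand-in for getSchema(): only the "info" struct
    // shown in the printSchema() output is reproduced here.
    static StructType getSchema() {
        StructType info = new StructType()
            .add("index", DataTypes.StringType)
            .add("type", DataTypes.StringType)
            .add("id", DataTypes.StringType)
            .add("name", DataTypes.StringType)
            .add("number", DataTypes.IntegerType);
        return new StructType().add("info", info);
    }

    // Parse the JSON in "value", then expand the nested struct with ".*"
    // so each field of payload.info becomes a top-level column.
    static Dataset<Row> flatten(Dataset<Row> ds) {
        Dataset<Row> df = ds.select(
            functions.from_json(functions.col("value").cast("string"), getSchema())
                .as("payload"));
        return df.select("payload.info.*");
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
            .master("local[1]")
            .appName("flatten-info")
            .getOrCreate();

        // Invented sample rows; fields missing from the JSON come back as null.
        Dataset<Row> ds = spark.createDataset(
            Arrays.asList(
                "{\"info\":{\"type\":\"mango\",\"name\":\"fruit\"}}",
                "{\"info\":{\"type\":\"apple\",\"name\":\"fruit\"}}"),
            Encoders.STRING()).toDF("value");

        Dataset<Row> flat = flatten(ds);
        flat.printSchema();
        flat.show();

        spark.stop();
    }
}
```

If this works the way I expect, the columns of the result should be index, type, id, name and number, in the order the schema declares them.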
Thanks!