I am trying to work with nested Parquet data. Reading and writing the Parquet file works now, but when I try to query a nested field with SQLContext I get an exception:
RuntimeException: "Can't access nested field in type
ArrayType(StructType(List(StructField(..."
I generate the Parquet file by parsing the data into the following case class structure:
case class areas(area : String, dates : Seq[Int])
case class dataset(userid : Long, source : Int, days : Seq[Int], areas : Seq[areas])
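Roughly, the write side looks like this (a simplified sketch: the SparkContext setup, the sample record, and the path are placeholders for my actual code):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(new SparkConf().setAppName("nested-parquet"))
val sqlContext = new SQLContext(sc)
import sqlContext.createSchemaRDD  // implicit conversion RDD[Product] => SchemaRDD

// placeholder data; my real code parses the input into an RDD[dataset]
val parsed = sc.parallelize(Seq(
  dataset(500001L, 1, Seq(1, 2), Seq(areas("north", Seq(20150101, 20150102))))
))
parsed.saveAsParquetFile("/tmp/testtable.parquet")  // placeholder path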
automatically generated schema:
root
|-- userid: long (nullable = false)
|-- source: integer (nullable = false)
|-- days: array (nullable = true)
| |-- element: integer (containsNull = false)
|-- areas: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- area: string (nullable = true)
| | |-- dates: array (nullable = true)
| | | |-- element: integer (containsNull = false)
After writing the Parquet file I load the data again, create a SQLContext (as above), and try to execute a SQL command. The loading step looks roughly like this (the path is again a placeholder):
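val parquetFile = sqlContext.parquetFile("/tmp/testtable.parquet")

Then I register a temp table and run the query: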
parquetFile.registerTempTable("testtable")
val result = sqlContext.sql("SELECT areas.area FROM testtable WHERE userid > 500000")
result.map(t => t(0)).collect().foreach(println) // throws the exception
If I execute this command instead:

val result = sqlContext.sql("SELECT areas[0].area FROM testtable WHERE userid > 500000")

I only get the values at the first position of the array, but I need every value, so that doesn't work for me.
I also saw the function t.getAs[...], but everything I tried with it didn't work.
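For example, one of my attempts looked roughly like this (a sketch reconstructed from memory; I am assuming the struct elements come back as org.apache.spark.sql.Row and that getAs[Seq[Row]] is the right signature):

// read the whole array column and pull the area field out of each struct
val rows = sqlContext.sql("SELECT areas FROM testtable WHERE userid > 500000")
rows.map(r => r.getAs[Seq[Row]](0).map(_.getString(0))).collect().foreach(println)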
I hope somebody can help me figure out how to access a nested field so that I can read all values of the nested array, or is this simply not supported?
I use spark-sql_2.10 (v1.2.0), spark-core_2.10 (v1.2.0), and parquet 1.6.0rc4.