You need to use LATERAL VIEW explode, which produces one output row per element of the array: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView
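
For your table, a rough and untested sketch might look like the following. LATERAL VIEW is HiveQL syntax, so I am assuming you run the query through a HiveContext (which needs the spark-hive_2.10 artifact on the classpath) rather than the plain SQLContext. The object name, the Parquet path, and the aliases `exploded` and `a` are placeholders I made up; the table and column names come from your message.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object ExplodeAreas {
  // explode(areas) emits one row per struct in the `areas` array;
  // `a` names that struct, so `a.area` is an ordinary field access.
  val query: String =
    """SELECT userid, a.area
      |FROM testtable
      |LATERAL VIEW explode(areas) exploded AS a
      |WHERE userid > 500000""".stripMargin

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("explode-areas").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Assumption: HiveContext instead of SQLContext, so that HiveQL is parsed.
    val hiveContext = new HiveContext(sc)

    // Hypothetical path; point this at the Parquet file you wrote earlier.
    val parquetFile = hiveContext.parquetFile("/path/to/dataset.parquet")
    parquetFile.registerTempTable("testtable")

    val result = hiveContext.sql(query)

    // Each row now holds (userid, area) for a single array element.
    result.collect().foreach(row => println(row(0) + "\t" + row(1)))

    sc.stop()
  }
}
```

If you also need the inner `dates` array flattened, a second LATERAL VIEW explode(a.dates) can be chained after the first one in the same way.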

On Fri, Jan 23, 2015 at 7:02 AM, matthes <mdiekst...@sensenetworks.com> wrote:

> I am trying to work with nested Parquet data. Reading and writing the
> Parquet file works now, but when I query a nested field with SQLContext
> I get an exception:
>
> RuntimeException: "Can't access nested field in type
> ArrayType(StructType(List(StructField(..."
>
> I generate the Parquet file by parsing the data into the following
> case class structure:
>
> case class areas(area : String, dates : Seq[Int])
> case class dataset(userid : Long, source : Int, days : Seq[Int], areas : Seq[areas])
>
> Automatically generated schema:
>
> root
>  |-- userid: long (nullable = false)
>  |-- source: integer (nullable = false)
>  |-- days: array (nullable = true)
>  |    |-- element: integer (containsNull = false)
>  |-- areas: array (nullable = true)
>  |    |-- element: struct (containsNull = true)
>  |    |    |-- area: string (nullable = true)
>  |    |    |-- dates: array (nullable = true)
>  |    |    |    |-- element: integer (containsNull = false)
>
> After writing the Parquet file I load the data again, create a
> SQLContext, and try to execute a SQL command as follows:
>
> parquetFile.registerTempTable("testtable")
> val result = sqlContext.sql("SELECT areas.area FROM testtable where userid > 500000")
> result.map(t => t(0)).collect().foreach(println) // throws the exception
>
> If I execute this command instead:
>
> val result = sqlContext.sql("SELECT areas[0].area FROM testtable where userid > 500000")
>
> I get only the values at the first position in the array, but I need
> every value, and that doesn't work.
> I saw the function t.getAs[...], but nothing I tried with it worked.
>
> I hope somebody can tell me how to access a nested field so that I read
> all values of the nested array. Or isn't that supported?
>
> I use spark-sql_2.10 (v1.2.0), spark-core_2.10 (v1.2.0) and parquet 1.6.0rc4.