Hi, I am having trouble accessing an array element in JSON data loaded into a DataFrame. Here is my code and the resulting schema:

val json1 = """{"f1":"1", "f1a":[{"f2":"2"}] }"""
val rdd1 = sc.parallelize(List(json1))
val df1 = sqlContext.read.json(rdd1)
df1.printSchema()

root
 |-- f1: string (nullable = true)
 |-- f1a: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- f2: string (nullable = true)

I would expect to be able to select the first element of "f1a" this way:

df1.select("f1a[0]").show()

but it fails with:

org.apache.spark.sql.AnalysisException: cannot resolve 'f1a[0]' given input columns f1, f1a;

This is with Spark 1.6.0. Please help.

A follow-up question: can I access arbitrary levels of nested JSON (an array of structs of arrays of structs)?

Thanks,
Xinh
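
For reference, here is a minimal sketch of alternatives that may work for the failing select above. It assumes a Spark 1.6 spark-shell session (so `sc` and `sqlContext` are predefined) and uses the Column methods `getItem` and `getField` plus `DataFrame.selectExpr`. The likely cause of the error is that `select(String)` treats its argument as a column name rather than an expression, though I have not confirmed that against the 1.6.0 source.

```scala
// Assumption: running inside a Spark 1.6 spark-shell, where `sc` and
// `sqlContext` already exist.
import org.apache.spark.sql.functions.col

val json1 = """{"f1":"1", "f1a":[{"f2":"2"}]}"""
val df1 = sqlContext.read.json(sc.parallelize(List(json1)))

// Index into the array through the Column API instead of a plain column name.
df1.select(col("f1a").getItem(0)).show()

// The same extraction using the apply shorthand on Column.
df1.select(df1("f1a")(0)).show()

// Let the SQL expression parser handle the bracket syntax.
df1.selectExpr("f1a[0]").show()

// One level deeper: field f2 of the first struct in f1a.
df1.select(col("f1a").getItem(0).getField("f2")).show()
df1.selectExpr("f1a[0].f2").show()
```

For deeper nesting (array of struct of array of struct), the same calls should chain, alternating `getItem(i)` for arrays and `getField("name")` for structs.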