Hi,

I am having trouble accessing an array element in JSON data loaded into a
DataFrame. Here is the code and the resulting schema:

val json1 = """{"f1":"1", "f1a":[{"f2":"2"}]}"""
val rdd1 = sc.parallelize(List(json1))
val df1 = sqlContext.read.json(rdd1)
df1.printSchema()

root
 |-- f1: string (nullable = true)
 |-- f1a: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- f2: string (nullable = true)

I would expect to be able to select the first element of "f1a" this way:
df1.select("f1a[0]").show()

org.apache.spark.sql.AnalysisException: cannot resolve 'f1a[0]' given input
columns f1, f1a;

This is with Spark 1.6.0.

Please help. A follow-up question: can I access elements at arbitrary levels
of nesting, e.g. a JSON array of structs of arrays of structs?
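
For reference, here is a minimal sketch of the kind of access I am after. It
assumes a Spark 1.6 spark-shell session (so sc and sqlContext are predefined)
and rests on my understanding that select(String*) resolves its arguments as
plain column names, while selectExpr parses them as SQL expressions. I have
not verified that the nested forms below behave the same for deeper nesting.

```scala
// Assumption: run inside spark-shell 1.6, where sc and sqlContext already exist.
val json1 = """{"f1":"1", "f1a":[{"f2":"2"}]}"""
val df1 = sqlContext.read.json(sc.parallelize(List(json1)))

// selectExpr treats the string as an expression, so the [0] index is parsed
// rather than looked up as part of a column name.
df1.selectExpr("f1a[0]").show()

// The same access through the Column API: getItem indexes into the array.
df1.select(df1("f1a").getItem(0)).show()

// One level deeper: field f2 of the first array element.
df1.selectExpr("f1a[0].f2").show()
df1.select(df1("f1a").getItem(0).getField("f2")).show()
```

If this is the right approach, I would guess deeper levels chain the same way
(getItem for arrays, getField for structs), but confirmation would be welcome.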

Thanks,
Xinh
