Are there any plans to support JSON arrays more fully? Take for example:

    val myJson = sqlContext.jsonRDD(sc.parallelize(List("""{"foo":[{"bar":1},{"baz":2}]}""")))
    myJson.registerTempTable("JsonTest")
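To make the shape concrete, here is a plain-Scala sketch (no Spark required) of the data above and the key-based extraction I am after; `JsonKeyFilter`, `records`, and `barsOnly` are hypothetical names standing in for the SchemaRDD rows:

```scala
object JsonKeyFilter {
  // Plain-Scala stand-in for the rows parsed from
  // {"foo":[{"bar":1},{"baz":2}]} -- each inner Map models one JSON object.
  // (hypothetical model; not the Spark API)
  val records: Seq[Seq[Map[String, Int]]] =
    Seq(Seq(Map("bar" -> 1), Map("baz" -> 2)))

  // Keep only records whose "foo" array contains an object with a "bar"
  // key, then project just those objects -- the filter/map I would like
  // to express against the SchemaRDD.
  val barsOnly: Seq[Seq[Map[String, Int]]] = records
    .filter(_.exists(_.contains("bar")))
    .map(_.filter(_.contains("bar")))

  def main(args: Array[String]): Unit =
    println(barsOnly) // List(List(Map(bar -> 1)))
}
```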
I would like a way to pull out parts of the array data based on a key:

    sql("""SELECT foo["bar"] FROM JsonTest""")
    // projects only the objects with a "bar" key; the rest would be null

I could even work around this if there were some way to access the key name from the SchemaRDD:

    myJson.filter(x => x(0).asInstanceOf[Seq[Row]].exists(y => y.key == "bar"))
          .map(x => x(0).asInstanceOf[Seq[Row]].filter(y => y.key == "bar"))
    // Same as above, except it also filters out records without a "bar" key.
    // (Row has no `key` accessor today -- this is the API I am missing.)

The closest suggestion I have found so far is Hive's LATERAL VIEW,
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView
but that still does not solve the problem of pulling out the keys. I also tried a UDF, but could not make that work either.

If there isn't anything in the works, would it be appropriate to create a ticket for this?

Thanks,
Justin