[ https://issues.apache.org/jira/browse/SPARK-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Josh Rosen updated SPARK-4561:
------------------------------
    Target Version/s: 1.3.0  (was: 1.2.0)

Good point; if we add a {{recursive}} option and have recursion off by default, then it's not urgent to fix this now, since the new option will be backwards-compatible with what we ship in 1.2.0.

> PySparkSQL's Row.asDict() should convert nested rows to dictionaries
> ---------------------------------------------------------------------
>
>                 Key: SPARK-4561
>                 URL: https://issues.apache.org/jira/browse/SPARK-4561
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark, SQL
>    Affects Versions: 1.2.0
>            Reporter: Josh Rosen
>            Assignee: Davies Liu
>
> In PySpark, you can call {{asDict()}} on a Spark SQL {{Row}} to convert it to a dictionary. Unfortunately, this does not convert nested rows to dictionaries. For example:
> {code}
> >>> sqlContext.sql("select results from results").first()
> Row(results=[Row(time=3.762), Row(time=3.47), Row(time=3.559),
> Row(time=3.458), Row(time=3.229), Row(time=3.21), Row(time=3.166),
> Row(time=3.276), Row(time=3.239), Row(time=3.149)])
> >>> sqlContext.sql("select results from results").first().asDict()
> {u'results': [(3.762,),
>  (3.47,),
>  (3.559,),
>  (3.458,),
>  (3.229,),
>  (3.21,),
>  (3.166,),
>  (3.276,),
>  (3.239,),
>  (3.149,)]}
> {code}
> Actually, it looks like the nested fields are just left as Rows (IPython's
> fancy display logic obscured this in my first example):
> {code}
> >>> Row(results=[Row(time=1), Row(time=2)]).asDict()
> {'results': [Row(time=1), Row(time=2)]}
> {code}
> Here's the output I'd expect:
> {code}
> >>> Row(results=[Row(time=1), Row(time=2)]).asDict()
> {'results': [{'time': 1}, {'time': 2}]}
> {code}
> I ran into this issue when trying to use Pandas dataframes to display nested
> data that I queried from Spark SQL.
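For reference, a minimal sketch of what the recursive conversion discussed above could look like. The helper name {{to_dict_recursive}} is hypothetical and not part of the PySpark API; the eventual {{recursive}} option may well be implemented differently inside {{Row}} itself.

{code}
# Hypothetical illustration only: recursively convert a Row (and any
# Rows nested in lists/dicts) into plain Python dictionaries.
def to_dict_recursive(obj):
    # A Row behaves like a tuple with named fields; asDict() converts
    # only the top level, so we recurse into each value ourselves.
    if hasattr(obj, "asDict"):
        return {k: to_dict_recursive(v) for k, v in obj.asDict().items()}
    if isinstance(obj, list):
        return [to_dict_recursive(v) for v in obj]
    if isinstance(obj, dict):
        return {k: to_dict_recursive(v) for k, v in obj.items()}
    return obj

# >>> from pyspark.sql import Row
# >>> to_dict_recursive(Row(results=[Row(time=1), Row(time=2)]))
# {'results': [{'time': 1}, {'time': 2}]}
{code}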