[
https://issues.apache.org/jira/browse/SPARK-5092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14619643#comment-14619643
]
Michael Armbrust commented on SPARK-5092:
-
Thanks for the comment. However, in most use cases I have seen, people
use SQL to unnest the specific things they want. Either way, we are pretty
much forced to stick with what we do now (and what Hive does) to maintain
compatibility.
Selecting from a nested structure with SparkSQL should return a nested structure
Key: SPARK-5092
URL: https://issues.apache.org/jira/browse/SPARK-5092
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 1.2.0
Reporter: Brad Willard
Priority: Minor
Labels: pyspark, spark, sql
When running a Spark SQL query like this (at least on a JSON dataset):
select
rid,
meta_data.name
from
a_table
The rows returned lose the nested structure. I receive a row like
Row(rid='123', name='delete')
instead of
Row(rid='123', meta_data=Row(name='data'))
I personally think this is confusing, especially when programmatically
building and executing queries and then parsing the results to find your data
in a new structure. I can understand that nesting is less desirable in some
situations, but you could get around it by supporting 'as'. If you wanted to
skip the nested structure, you would simply write:
select
rid,
meta_data.name as name
from
a_table
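
To illustrate the parsing concern above: a caller that builds the query programmatically already knows the dotted paths it selected, so it can restore the nesting on its own side. The following is only a minimal sketch in plain Python, not part of Spark. `nest_row` is a hypothetical helper, and it assumes the flattened values come back in the same order as the SELECT list.

```python
def nest_row(paths, values):
    """Rebuild a nested dict from the dotted paths that were selected
    and the flat values returned for them (hypothetical helper)."""
    if len(paths) != len(values):
        raise ValueError("expected one value per selected path")
    nested = {}
    for path, value in zip(paths, values):
        parts = path.split(".")
        node = nested
        # Walk (or create) the intermediate levels, e.g. "meta_data".
        for key in parts[:-1]:
            node = node.setdefault(key, {})
        # Store the value under the last path component, e.g. "name".
        node[parts[-1]] = value
    return nested


# Columns in the order they appear in the SELECT list.
selected = ["rid", "meta_data.name"]
# Values of the flattened row, in the same order.
flat_values = ("123", "delete")

print(nest_row(selected, flat_values))
```

The sketch does not handle selecting both a struct and one of its fields (for example `meta_data` together with `meta_data.name`); a real caller would need to decide how to treat that case.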