[ https://issues.apache.org/jira/browse/SPARK-11329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14975351#comment-14975351 ]
Yin Huai commented on SPARK-11329:
----------------------------------

We can try {{sqlContext.sql("select max(struct(1, *)) from (select 1 as a, 2 as b) tmp group by a").queryExecution.analyzed}} and the plan looks like
{code}
res11: org.apache.spark.sql.catalyst.plans.logical.LogicalPlan =
'Aggregate ['a], [unresolvedalias('max('struct(1,*)))]
 Subquery tmp
  Project [1 AS a#27,2 AS b#28]
   OneRowRelation$
{code}
The star inside {{struct}} is still unresolved in the analyzed plan. Looks like [this analysis rule|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L320-L326] did not work as expected?

> Expand Star when creating a struct
> ----------------------------------
>
>                 Key: SPARK-11329
>                 URL: https://issues.apache.org/jira/browse/SPARK-11329
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Yin Huai
>
> It is pretty common for customers to do regular extractions of update data
> from an external datasource (e.g. mysql or postgres). While this is possible
> today, the syntax is a little onerous. With some small improvements to the
> analyzer I think we could make this much easier.
> Goal: Allow users to execute the following two queries as well as their
> dataframe equivalents
> to find the most recent record for each key
> {{SELECT max(struct(timestamp, *)) as mostRecentRecord GROUP BY key}}
> to unnest the struct from above.
> {{SELECT mostRecentRecord.* FROM data}}
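
For readers unfamiliar with the idiom in the issue description: {{max(struct(timestamp, *))}} works because structs are compared field by field, left to right, so putting the timestamp first makes the maximum struct the most recent row. The following is a minimal sketch of that semantics in plain Python tuples, not Spark code. The {{Record}} type, its field names, and {{most_recent_per_key}} are hypothetical and chosen only for illustration.

```python
from collections import namedtuple

# Hypothetical row type; the field names are illustrative, not from the issue.
Record = namedtuple("Record", ["key", "timestamp", "value"])


def most_recent_per_key(records):
    """Model max(struct(timestamp, *)) ... GROUP BY key with plain tuples."""
    best = {}
    for r in records:
        # struct(timestamp, *) puts the timestamp first, then every column.
        candidate = (r.timestamp,) + tuple(r)
        # Tuples compare left to right, so the latest timestamp wins.
        if r.key not in best or candidate > best[r.key]:
            best[r.key] = candidate
    # Unnest the winning tuple; the duplicated leading timestamp is dropped
    # here only to get back to the original column layout.
    return {k: Record(*v[1:]) for k, v in best.items()}


rows = [
    Record("a", 1, "old"),
    Record("a", 3, "new"),
    Record("b", 2, "only"),
]
latest = most_recent_per_key(rows)
```

In this model, rows with equal timestamps are ordered by the remaining fields, which mirrors the field-by-field comparison but is otherwise an arbitrary tie-break.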