[ https://issues.apache.org/jira/browse/SPARK-12725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15125587#comment-15125587 ]
Cheng Lian commented on SPARK-12725:
------------------------------------

There are other analysis rules that may use generated attributes (e.g., {{DistinctAggregationRewriter}}). I think a generic approach is better than special-casing them one by one.

> SQL generation suffers from name conflicts introduced by some analysis rules
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-12725
>                 URL: https://issues.apache.org/jira/browse/SPARK-12725
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Cheng Lian
>
> Some analysis rules generate auxiliary attribute references with the same
> name but different expression IDs. For example, {{ResolveAggregateFunctions}}
> introduces {{havingCondition}} and {{aggOrder}}, and
> {{DistinctAggregationRewriter}} introduces {{gid}}.
> This is OK for normal query execution since these attribute references get
> distinct expression IDs. However, it's troublesome when converting resolved
> query plans back to SQL query strings since expression IDs are erased.
> Here's an example Spark 1.6.0 snippet for illustration:
> {code}
> sqlContext.range(10).select('id as 'a, 'id as 'b).registerTempTable("t")
> sqlContext.sql("SELECT SUM(a) FROM t GROUP BY a, b ORDER BY COUNT(a), COUNT(b)").explain(true)
> {code}
> The above code produces the following resolved plan:
> {noformat}
> == Analyzed Logical Plan ==
> _c0: bigint
> Project [_c0#101L]
> +- Sort [aggOrder#102L ASC,aggOrder#103L ASC], true
>    +- Aggregate [a#47L,b#48L], [(sum(a#47L),mode=Complete,isDistinct=false) AS _c0#101L,(count(a#47L),mode=Complete,isDistinct=false) AS aggOrder#102L,(count(b#48L),mode=Complete,isDistinct=false) AS aggOrder#103L]
>       +- Subquery t
>          +- Project [id#46L AS a#47L,id#46L AS b#48L]
>             +- LogicalRDD [id#46L], MapPartitionsRDD[44] at range at <console>:26
> {noformat}
> Here we can see that both aggregate expressions in {{ORDER BY}} are extracted
> into an {{Aggregate}} operator, and both of them are named {{aggOrder}} with
> different expression IDs.
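
To make the conflict concrete, here is a minimal sketch in plain Scala with no Spark dependency. It shows how dropping expression IDs makes the two {{aggOrder}} attributes indistinguishable, and one possible generic way to keep them apart by folding the ID into the name. {{Attr}}, {{naiveNames}} and {{uniqueNames}} are hypothetical names made up for this illustration. They are not Spark APIs, and this is not necessarily the approach Spark takes.

```scala
// Hypothetical stand-in for a resolved attribute reference.
// This is NOT Spark's AttributeReference; it keeps only a name and an expression ID.
final case class Attr(name: String, exprId: Long)

object NameConflictSketch {
  // Naive SQL generation: expression IDs are dropped, so only the names survive.
  def naiveNames(attrs: Seq[Attr]): Seq[String] = attrs.map(_.name)

  // Generic alternative (an assumption, not Spark's implementation):
  // rename only those attributes whose name is shared by more than one expression ID.
  def uniqueNames(attrs: Seq[Attr]): Seq[String] = {
    val ambiguous: Set[String] = attrs
      .groupBy(_.name)
      .collect { case (name, group) if group.map(_.exprId).distinct.size > 1 => name }
      .toSet
    attrs.map { a =>
      if (ambiguous.contains(a.name)) s"${a.name}_${a.exprId}" else a.name
    }
  }

  def main(args: Array[String]): Unit = {
    // The two sort keys from the analyzed plan: aggOrder#102L and aggOrder#103L.
    val sortKeys = Seq(Attr("aggOrder", 102L), Attr("aggOrder", 103L))
    println(naiveNames(sortKeys).mkString(", "))  // aggOrder, aggOrder
    println(uniqueNames(sortKeys).mkString(", ")) // aggOrder_102, aggOrder_103
  }
}
```

Because the renaming is keyed on expression IDs rather than on specific names, the same function would also cover {{havingCondition}} and {{gid}} without a special case for each rule. The sketch does not guard against a user column that already happens to be called {{aggOrder_102}}; a real implementation would need to handle that case.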