[ https://issues.apache.org/jira/browse/SPARK-22266 ]
Wenchen Fan reassigned SPARK-22266:
-----------------------------------

    Assignee: Maryann Xue

> The same aggregate function was evaluated multiple times
> --------------------------------------------------------
>
>                 Key: SPARK-22266
>                 URL: https://issues.apache.org/jira/browse/SPARK-22266
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Maryann Xue
>            Assignee: Maryann Xue
>            Priority: Minor
>             Fix For: 2.3.0
>
>
> We should avoid evaluating the same aggregate function more than once,
> which is what the code comment below (patterns.scala:206) states. However,
> things did not work as expected.
> {code}
> // A single aggregate expression might appear multiple times in resultExpressions.
> // In order to avoid evaluating an individual aggregate function multiple times, we'll
> // build a set of the distinct aggregate expressions and build a function which can
> // be used to re-write expressions so that they reference the single copy of the
> // aggregate function which actually gets computed.
> {code}
> For example, the physical plan of
> {code}
> SELECT a, max(b+1), max(b+1) + 1 FROM testData2 GROUP BY a
> {code}
> was
> {code}
> HashAggregate(keys=[a#23], functions=[max((b#24 + 1)), max((b#24 + 1))], output=[a#23, max((b + 1))#223, (max((b + 1)) + 1)#224])
> +- HashAggregate(keys=[a#23], functions=[partial_max((b#24 + 1)), partial_max((b#24 + 1))], output=[a#23, max#231, max#232])
>    +- SerializeFromObject [assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData2, true]).a AS a#23, assertnotnull(input[0, org.apache.spark.sql.test.SQLTestData$TestData2, true]).b AS b#24]
>       +- Scan ExternalRDDScan[obj#22]
> {code}
> where in each HashAggregate there were two identical aggregate functions,
> "max(b#24 + 1)".
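As a rough illustration of the strategy the quoted comment describes, here is a toy Scala sketch using plain stand-ins rather than real Catalyst classes (the AggExpr case class and its fields are hypothetical; Catalyst compares expressions semantically rather than with plain case-class equality):

{code}
// Hypothetical stand-in for an aggregate expression; Catalyst's real
// AggregateExpression carries far more state than this.
case class AggExpr(fn: String, child: String)

// Result expressions may mention the same aggregate more than once,
// e.g. max(b+1) and max(b+1) + 1 both reference max(b+1).
val resultAggs = Seq(AggExpr("max", "b + 1"), AggExpr("max", "b + 1"))

// Build the set of distinct aggregate expressions that actually get
// computed...
val distinctAggs = resultAggs.distinct   // a single copy of max(b + 1)

// ...and a rewrite map so that every occurrence in the result
// expressions references that single computed copy.
val rewrite: Map[AggExpr, AggExpr] =
  resultAggs.map(e => e -> distinctAggs.find(_ == e).get).toMap
{code}

Per the report, this deduplication did not fire for semantically identical expressions such as max(b+1), leaving both copies in the functions list of each HashAggregate.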
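And a minimal, self-contained reproduction sketch of the reported plan, assuming a local Spark 2.2.x session (the sample rows are illustrative stand-ins for SQLTestData.testData2, which is not shown in the issue):

{code}
import org.apache.spark.sql.SparkSession

object Spark22266Repro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("SPARK-22266-repro")
      .getOrCreate()
    import spark.implicits._

    // Stand-in for SQLTestData.testData2: rows of (a, b).
    Seq((1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2))
      .toDF("a", "b")
      .createOrReplaceTempView("testData2")

    val df = spark.sql(
      "SELECT a, max(b+1), max(b+1) + 1 FROM testData2 GROUP BY a")

    // On affected versions (2.2.0), the printed physical plan lists
    // max((b + 1)) twice in each HashAggregate's functions; with the
    // fix, a single shared copy should appear instead.
    df.explain()

    spark.stop()
  }
}
{code}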