Yin Huai created SPARK-9257:
-------------------------------

             Summary: Fix the false negative of Aggregate2Sort and FinalAndCompleteAggregate2Sort's missingInput
                 Key: SPARK-9257
                 URL: https://issues.apache.org/jira/browse/SPARK-9257
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
            Reporter: Yin Huai

{code}
sqlContext.sql(
  """
    |SELECT sum(value)
    |FROM agg1
    |GROUP BY key
  """.stripMargin).explain()

== Physical Plan ==
Aggregate2Sort Some(List(key#510)), [key#510], [(sum(CAST(value#511, LongType))2,mode=Final,isDistinct=false)], [sum(CAST(value#511, LongType))#1435L], [sum(CAST(value#511, LongType))#1435L AS _c0#1426L]
 ExternalSort [key#510 ASC], false
  Exchange hashpartitioning(key#510)
   Aggregate2Sort None, [key#510], [(sum(CAST(value#511, LongType))2,mode=Partial,isDistinct=false)], [currentSum#1433L], [key#510,currentSum#1433L]
    ExternalSort [key#510 ASC], false
     PhysicalRDD [key#510,value#511], MapPartitionsRDD[97] at apply at Transformer.scala:22

sqlContext.sql(
  """
    |SELECT sum(distinct value)
    |FROM agg1
    |GROUP BY key
  """.stripMargin).explain()

== Physical Plan ==
!FinalAndCompleteAggregate2Sort [key#510,CAST(value#511, LongType)#1446L], [key#510], [(sum(CAST(value#511, LongType)#1446L)2,mode=Complete,isDistinct=false)], [sum(CAST(value#511, LongType))#1445L], [sum(CAST(value#511, LongType))#1445L AS _c0#1438L]
 Aggregate2Sort Some(List(key#510)), [key#510,CAST(value#511, LongType)#1446L], [key#510,CAST(value#511, LongType)#1446L]
  ExternalSort [key#510 ASC,CAST(value#511, LongType)#1446L ASC], false
   Exchange hashpartitioning(key#510)
    !Aggregate2Sort None, [key#510,CAST(value#511, LongType) AS CAST(value#511, LongType)#1446L], [key#510,CAST(value#511, LongType)#1446L]
     ExternalSort [key#510 ASC,CAST(value#511, LongType) AS CAST(value#511, LongType)#1446L ASC], false
      PhysicalRDD [key#510,value#511], MapPartitionsRDD[102] at apply at Transformer.scala:22
{code}

In the second plan shown above, there is a {{!}} at the beginning of the {{simpleString}} of {{FinalAndCompleteAggregate2Sort}} and of the partial {{Aggregate2Sort}}, which indicates that the operator's {{missingInput}} is not empty. In fact, this is a false negative and we need to fix it.

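The relationship between {{missingInput}} and the {{!}} prefix can be sketched in a few lines of Scala. This is only an illustrative model: {{Attr}}, {{Node}}, and {{MissingInputDemo}} are hypothetical stand-ins, not Spark classes, and it assumes that missing input is computed as the attributes an operator references minus the attributes its children output. The issue does not state why the flag is raised spuriously; the sketch assumes, purely for illustration, that an operator counts an alias it defines itself (here #1446) among its references.

```scala
// Hypothetical, simplified stand-ins for Catalyst's attribute and plan types.
final case class Attr(name: String, exprId: Long)

final case class Node(
    name: String,
    references: Set[Attr], // attributes read by the operator's expressions
    children: Seq[Node],
    output: Seq[Attr]) {

  // Everything the children make available to this operator.
  def inputSet: Set[Attr] = children.flatMap(_.output).toSet

  // Referenced attributes that no child produces.
  def missingInput: Set[Attr] = references -- inputSet

  // Prefix the one-line description with "!" when something looks missing.
  def simpleString: String = {
    val prefix = if (missingInput.nonEmpty && children.nonEmpty) "!" else ""
    prefix + name
  }
}

object MissingInputDemo {
  val key = Attr("key", 510L)
  val value = Attr("value", 511L)
  val castValue = Attr("CAST(value#511, LongType)", 1446L)

  val scan = Node("PhysicalRDD", Set.empty, Nil, Seq(key, value))
  val sort = Node("ExternalSort", Set(key), Seq(scan), Seq(key, value))

  // Assumption: the operator defines alias #1446 itself but also lists it
  // as a reference, so it appears to be missing from the child's output.
  val flagged =
    Node("Aggregate2Sort", Set(key, value, castValue), Seq(sort), Seq(key, castValue))

  // Dropping the self-defined attribute from the references clears the flag.
  val fixed = flagged.copy(references = flagged.references - castValue)

  def main(args: Array[String]): Unit = {
    println(flagged.simpleString)
    println(fixed.simpleString)
  }
}
```

In this model the operator itself is unchanged between {{flagged}} and {{fixed}}; only the set of attributes it reports as references differs, which is the kind of adjustment the issue asks for.
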
Also, it would be good to make these two operators' {{simpleString}} more reader-friendly, so that people can tell which are the grouping expressions, which are the aggregate functions, and what the mode of each aggregate function is.