[ https://issues.apache.org/jira/browse/SPARK-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yin Huai updated SPARK-2176:
----------------------------
    Description:
{code}
hql("explain select * from src group by key").collect().foreach(println)
[ExplainCommand [plan#27:0]]
[ Aggregate false, [key#25], [key#25,value#26]]
[ Exchange (HashPartitioning [key#25:0], 200)]
[ Exchange (HashPartitioning [key#25:0], 200)]
[ Aggregate true, [key#25], [key#25]]
[ HiveTableScan [key#25,value#26], (MetastoreRelation default, src, None), None]
{code}
There are two exchange operators.

However, if we do not use explain...
{code}
hql("select * from src group by key")
res4: org.apache.spark.sql.SchemaRDD =
SchemaRDD[8] at RDD at SchemaRDD.scala:100
== Query Plan ==
Aggregate false, [key#8], [key#8,value#9]
Exchange (HashPartitioning [key#8:0], 200)
Aggregate true, [key#8], [key#8]
HiveTableScan [key#8,value#9], (MetastoreRelation default, src, None), None
{code}
The plan is fine.

  was:
{code}
hql("explain select * from src group by key").collect().foreach(println)
[ExplainCommand [plan#27:0]]
[ Aggregate false, [key#25], [key#25,value#26]]
[ Exchange (HashPartitioning [key#25:0], 200)]
[ Exchange (HashPartitioning [key#25:0], 200)]
[ Aggregate true, [key#25], [key#25]]
[ HiveTableScan [key#25,value#26], (MetastoreRelation default, src, None), None]
{code}
There are two exchange operators.

> extra unnecessary exchange operator in group by
> -----------------------------------------------
>
>                 Key: SPARK-2176
>                 URL: https://issues.apache.org/jira/browse/SPARK-2176
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Reynold Xin
>            Assignee: Yin Huai
>
> {code}
> hql("explain select * from src group by key").collect().foreach(println)
> [ExplainCommand [plan#27:0]]
> [ Aggregate false, [key#25], [key#25,value#26]]
> [ Exchange (HashPartitioning [key#25:0], 200)]
> [ Exchange (HashPartitioning [key#25:0], 200)]
> [ Aggregate true, [key#25], [key#25]]
> [ HiveTableScan [key#25,value#26], (MetastoreRelation default, src, None), None]
> {code}
> There are two exchange operators.
> However, if we do not use explain...
> {code}
> hql("select * from src group by key")
> res4: org.apache.spark.sql.SchemaRDD =
> SchemaRDD[8] at RDD at SchemaRDD.scala:100
> == Query Plan ==
> Aggregate false, [key#8], [key#8,value#9]
> Exchange (HashPartitioning [key#8:0], 200)
> Aggregate true, [key#8], [key#8]
> HiveTableScan [key#8,value#9], (MetastoreRelation default, src, None), None
> {code}
> The plan is fine.
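
For anyone who wants to check printed plans for this symptom, here is a minimal sketch in plain Scala (no Spark dependency). It looks for an Exchange line immediately followed by another Exchange line. `PlanCheck`, `operatorName`, and `hasStackedExchange` are hypothetical names made up for illustration and are not part of Spark. The check assumes a linear, single-child plan like the ones in this report; in a plan with joins, two adjacent Exchange lines could be siblings rather than parent and child.

```scala
// Hypothetical helper, not part of Spark: scans printed plan text for an
// Exchange operator that is immediately followed by another Exchange.
// Assumes a linear (single-child) plan, as in the plans quoted in this issue.
object PlanCheck {

  // Returns the first token of a plan line, after removing the "[ ... ]"
  // row wrapping that appears when the explain result is collected and printed.
  def operatorName(line: String): String =
    line.trim.stripPrefix("[").stripSuffix("]").trim.takeWhile(c => !c.isWhitespace)

  // True if any two consecutive plan lines are both Exchange operators.
  def hasStackedExchange(planLines: Seq[String]): Boolean = {
    val ops = planLines.map(operatorName)
    ops.zip(ops.drop(1)).exists { case (a, b) => a == "Exchange" && b == "Exchange" }
  }

  def main(args: Array[String]): Unit = {
    // Plan lines copied from the explain output in this issue.
    val explained = Seq(
      "[ExplainCommand [plan#27:0]]",
      "[ Aggregate false, [key#25], [key#25,value#26]]",
      "[ Exchange (HashPartitioning [key#25:0], 200)]",
      "[ Exchange (HashPartitioning [key#25:0], 200)]",
      "[ Aggregate true, [key#25], [key#25]]",
      "[ HiveTableScan [key#25,value#26], (MetastoreRelation default, src, None), None]"
    )
    // Plan lines copied from the non-explain query plan in this issue.
    val direct = Seq(
      "Aggregate false, [key#8], [key#8,value#9]",
      "Exchange (HashPartitioning [key#8:0], 200)",
      "Aggregate true, [key#8], [key#8]",
      "HiveTableScan [key#8,value#9], (MetastoreRelation default, src, None), None"
    )
    println(hasStackedExchange(explained)) // prints true
    println(hasStackedExchange(direct))    // prints false
  }
}
```

This only inspects the printed text, so it reports the symptom and says nothing about why the explain path plans the query differently.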