[ https://issues.apache.org/jira/browse/SPARK-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Cheng Lian updated SPARK-6450:
------------------------------
    Summary: MetastoreRelation.equals doesn't compare output attributes  (was: f)

> MetastoreRelation.equals doesn't compare output attributes
> ----------------------------------------------------------
>
>                 Key: SPARK-6450
>                 URL: https://issues.apache.org/jira/browse/SPARK-6450
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.3.0
>            Reporter: Anand Mohan Tumuluri
>            Assignee: Michael Armbrust
>            Priority: Blocker
>
> The query below worked up to 1.3 commit
> 9a151ce58b3e756f205c9f3ebbbf3ab0ba5b33fd. (Yes, it definitely works at this
> commit, although the commit itself is completely unrelated.)
> It broke in the 1.3.0 release with an AnalysisException: resolved attributes
> ... missing from .... (although this list contains the fields which it
> reports missing)
> {code}
>   at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.run(Shim13.scala:189)
>   at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:231)
>   at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:218)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:79)
>   at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:37)
>   at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:64)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>   at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:493)
>   at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:60)
>   at com.sun.proxy.$Proxy17.executeStatementAsync(Unknown Source)
>   at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:233)
>   at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:344)
>   at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
>   at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:55)
>   at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> {code}
> select Orders.Country, Orders.ProductCategory, count(1)
> from Orders
> join (select Orders.Country, count(1) CountryOrderCount
>       from Orders
>       where to_date(Orders.PlacedDate) > '2015-01-01'
>       group by Orders.Country
>       order by CountryOrderCount DESC
>       LIMIT 5) Top5Countries
>   on Top5Countries.Country = Orders.Country
> where to_date(Orders.PlacedDate) > '2015-01-01'
> group by Orders.Country, Orders.ProductCategory;
> {code}
> The temporary workaround is to add an explicit alias for the table Orders:
> {code}
> select o.Country, o.ProductCategory, count(1)
> from Orders o
> join (select r.Country, count(1) CountryOrderCount
>       from Orders r
>       where to_date(r.PlacedDate) > '2015-01-01'
>       group by r.Country
>       order by CountryOrderCount DESC
>       LIMIT 5) Top5Countries
>   on Top5Countries.Country = o.Country
> where to_date(o.PlacedDate) > '2015-01-01'
> group by o.Country, o.ProductCategory;
> {code}
> However, this change affects not only self joins; it also seems to affect
> union queries. The query below, which also worked before (at commit
> 9a151ce), is now broken:
> {code}
> select Orders.Country, null, count(1) OrderCount
> from Orders
> group by Orders.Country, null
> union all
> select null, Orders.ProductCategory, count(1) OrderCount
> from Orders
> group by null, Orders.ProductCategory
> {code}
> It also fails with an AnalysisException.
> The workaround is to add different aliases for the tables.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
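
The report describes the workaround for the union query (a different alias for each reference to Orders) but does not show the rewritten query. A sketch of what it might look like, following the same pattern as the self-join workaround. The aliases o1 and o2 are hypothetical, and this form is not confirmed by the reporter or verified against 1.3.0:

```sql
-- Assumed workaround: give each branch of the union its own alias for Orders
-- (o1 and o2 are illustrative names, not taken from the report).
select o1.Country, null, count(1) OrderCount
from Orders o1
group by o1.Country, null
union all
select null, o2.ProductCategory, count(1) OrderCount
from Orders o2
group by null, o2.ProductCategory
```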