Seems you hit https://issues.apache.org/jira/browse/SPARK-4296. It has been fixed in 1.2.1 and 1.3.
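Until you can upgrade, one possible workaround (just a sketch, untested against RC2; "t" and "hour_bucket" are illustrative names, not anything from your schema) is to evaluate the UDF in a subquery so that the outer GROUP BY only references a plain column:

-- compute the bucket once in the inner query, then group by the resulting column
select hour_bucket, count(*) count
from (select from_unixtime(epoch, 'yyyy-MM-dd-HH') hour_bucket from tbl) t
group by hour_bucket;

This should sidestep the bug because the planner no longer has to match the UDF expression in the SELECT list against the one in the GROUP BY clause.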
On Thu, Feb 26, 2015 at 1:22 PM, Yana Kadiyska <yana.kadiy...@gmail.com> wrote:
> Can someone confirm if they can run UDFs in GROUP BY in Spark 1.2?
>
> I have two builds running -- one from a custom build from early December
> (commit 4259ca8dd12), which works fine, and Spark 1.2 RC2.
>
> On the latter I get:
>
> jdbc:hive2://XXX.208:10001> select from_unixtime(epoch,'yyyy-MM-dd-HH'),count(*) count
> . . . . . . . . . . . . . . . . . .> from tbl
> . . . . . . . . . . . . . . . . . .> group by from_unixtime(epoch,'yyyy-MM-dd-HH');
> Error: org.apache.spark.sql.catalyst.errors.package$TreeNodeException:
> Expression not in GROUP BY:
> HiveSimpleUdf#org.apache.hadoop.hive.ql.udf.UDFFromUnixTime(epoch#1049L,yyyy-MM-dd-HH) AS _c0#1004, tree:
> Aggregate [HiveSimpleUdf#org.apache.hadoop.hive.ql.udf.UDFFromUnixTime(epoch#1049L,yyyy-MM-dd-HH)],
> [HiveSimpleUdf#org.apache.hadoop.hive.ql.udf.UDFFromUnixTime(epoch#1049L,yyyy-MM-dd-HH) AS _c0#1004,COUNT(1) AS count#1003L]
> MetastoreRelation default, tbl, None (state=,code=0)
>
> This worked fine on my older build. I don't see a JIRA on this, but maybe
> I'm not looking right. Can someone please advise?