Re: TreeNodeException: No function to evaluate expression. type: AttributeReference, tree: id#0 on GROUP BY
Aha, that makes sense. Thanks for the response! I guess error messages are one of
the areas where Spark could use some love. (:

On Fri, Jul 18, 2014 at 9:41 PM, Michael Armbrust wrote:
> Sorry for the non-obvious error message. It is not valid SQL to include
> attributes in the select clause unless they are also in the group by clause
> or are inside of an aggregate function.
>
> On Jul 18, 2014 5:12 AM, "Martin Gammelsæter" wrote:
>> [original question and stack trace snipped; quoted in full below]

--
Best regards,
Martin Gammelsæter
92209139
Re: TreeNodeException: No function to evaluate expression. type: AttributeReference, tree: id#0 on GROUP BY
Sorry for the non-obvious error message. It is not valid SQL to include
attributes in the select clause unless they are also in the group by clause or
are inside of an aggregate function.

On Jul 18, 2014 5:12 AM, "Martin Gammelsæter" wrote:
> Hi again!
>
> I am having problems when using GROUP BY on both SQLContext and
> HiveContext (same problem).
>
> My code (simplified as much as possible) can be seen here:
> http://pastebin.com/33rjW67H
>
> [rest of original question and stack trace snipped; quoted in full below]
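[Editor's note: the rule Michael describes — every selected column must either
appear in the GROUP BY clause or sit inside an aggregate function — can be
sketched with a small self-contained example. Python's sqlite3 is used here only
because it needs no Spark installation; the table and column names are
hypothetical stand-ins for the Dokument table in the thread. Note that SQLite
itself is lenient about ungrouped "bare" columns, whereas Spark SQL and standard
SQL reject them, so only the valid form is executed below.]

```python
import sqlite3

# Hypothetical in-memory table standing in for the Dokument table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dokument (id INTEGER, category TEXT)")
conn.executemany("INSERT INTO dokument VALUES (?, ?)",
                 [(1, "a"), (2, "a"), (3, "b")])

# Invalid in standard SQL, and the shape that triggers the TreeNodeException
# in Spark SQL: `id` is neither grouped nor aggregated.
#   SELECT id, category FROM dokument GROUP BY category

# Valid: every selected expression is either a GROUP BY column (`category`)
# or an aggregate over the group (`COUNT(id)`).
rows = conn.execute(
    "SELECT category, COUNT(id) FROM dokument GROUP BY category ORDER BY category"
).fetchall()
print(rows)  # [('a', 2), ('b', 1)]
```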
TreeNodeException: No function to evaluate expression. type: AttributeReference, tree: id#0 on GROUP BY
Hi again!

I am having problems when using GROUP BY on both SQLContext and HiveContext
(same problem).

My code (simplified as much as possible) can be seen here:
http://pastebin.com/33rjW67H

In short, I'm getting data from a Cassandra store with Datastax's new driver
(which works great by the way, recommended!), and mapping it to a Spark SQL
table through a Product class (Dokument in the source). Regular SELECTs and
such work fine, but once I try to do a GROUP BY, I get the following error:

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to
stage failure: Task 0.0:25 failed 4 times, most recent failure: Exception
failure in TID 63 on host 192.168.121.132:
org.apache.spark.sql.catalyst.errors.package$TreeNodeException: No function to
evaluate expression. type: AttributeReference, tree: id#0
    org.apache.spark.sql.catalyst.expressions.AttributeReference.eval(namedExpressions.scala:158)
    org.apache.spark.sql.catalyst.expressions.MutableProjection.apply(Projection.scala:64)
    org.apache.spark.sql.execution.Aggregate$$anonfun$execute$1$$anonfun$7$$anon$1.next(Aggregate.scala:195)
    org.apache.spark.sql.execution.Aggregate$$anonfun$execute$1$$anonfun$7$$anon$1.next(Aggregate.scala:174)
    scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    scala.collection.Iterator$class.foreach(Iterator.scala:727)
    scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
    scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
    scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
    scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
    scala.collection.AbstractIterator.to(Iterator.scala:1157)
    scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
    scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
    scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
    scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
    org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:750)
    org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:750)
    org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1096)
    org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1096)
    org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:112)
    org.apache.spark.scheduler.Task.run(Task.scala:51)
    org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
    java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    java.lang.Thread.run(Thread.java:745)

What am I doing wrong?

--
Best regards,
Martin Gammelsæter